In blog 1 in this series, we looked at how TriggerMesh lets you declaratively define your Kafka messages and send them in the CloudEvent format to a Kubernetes workload.
Blog 2 did the reverse, and explored how to use TriggerMesh to send messages to a Kafka topic declaratively from a Kubernetes workload.
This blog will show you how to define your Kafka Sources and Sinks declaratively and manage them in your Kubernetes clusters. This means you can avoid updating and compiling a lot of Java and uploading JARs.
Streaming data into a Kafka cluster and from a Kafka cluster to somewhere else is usually done with a Kafka connect cluster and associated configurations.
This is usually a real pain point for Kafka users and produces a significant drag on velocity.
While there is a nice collection of Kafka connectors, few on Confluent cloud are fully managed, which leaves Kafka users with a lot of work to do if they want to manage their sources and sinks efficiently.
Following up on the previous two blogs where we reviewed how to produce messages to Kafka and how to consume messages from Kafka, I am now going to show you how you can define your Kafka Sources and Sinks declaratively and manage them in your Kubernetes clusters.
From the point of view of a Kafka cluster, a Kafka source is something that produces an event into a Kafka topic. With the help of Knative we can configure a Kafka source using an object called a KafkaSink this is super confusing I know and I am really sorry about it :) everything is relative and depends on where you stand.
To create an addressable endpoint in your Kubernetes cluster that will become the source of messages into your Kafka cluster you create an object like that one below:
Note the bootstrap server on Confluent cloud, and the specified topic. Equipped with this you can now create a AWS SQS source that will send messages directly into Kafka with a manifest like the one below:
Note that the AWS SQS is represented by its ARN and that you can access it thanks to a Kubernetes secret which contains AWS API keys. The sink section points to the KafkaSink previously created. With this manifest all SQS messages will be consumed and sent to Kafka using the Cloudevents specification.
Two Kubernetes objects and you have a declarative definition of your Kafka source
For your Kafka Sink, we sadly fall back to the same potential confusion: what is a Sink from the Kafka cluster perspective is a source outside of that cluster. So to define a Kafka sink, you define a KafkaSource in your Kubernetes cluster.
For example the manifest below specifies your bootstrap server pointing to a Kafka cluster in the Confluent cloud, a specific topic from which to consume messages from and a sink which is the end target of your Kafka messages. Here we point to a Kubernetes service called display :
So all you need for your Kafka Sink definition is a single Kubernetes manifest and a target micro-service exposed as a traditional Kubernetes service. Note that it could also be a Knative service in which case your target would benefit from auto-scaling including scale to zero.
Instead of a complex work flow of compiling JARs and uploading them to a Kafka connect cluster, you can leverage your Kubernetes cluster and define your Kafka sources and sinks with Kubernetes objects.
This brings a declarative mindset to your Kafka sources and sinks, gives you a source of truth and unifies the management of your Kafka flows and your micro-services. In addition this simplifies targeting micro-services running in Kubernetes when Kafka messages are present and with the addition of Knative gives you the autoscaling necessary to handle changing stream volume.
If you need a collection of Kafka sources, suddenly all Knative event sources can be used (e.g GitHub, GitLab, JIRA, Zendesk, Slack, Kinesis…)