Exploring Kafka

In this blog we will explore the basics of Kafka, an open-source distributed streaming platform originally developed at LinkedIn and later donated to the Apache Software Foundation.

Kafka is generally used to build real-time streaming data applications, so you can stream live data from source systems without delay. Following are some capabilities of Kafka:

  • Publish streams of records.
  • Store streams of records in a fault-tolerant way.
  • Subscribe to/consume streams of records.

Download Kafka:

Download the latest Kafka from kafka.apache.org and un-tar it.
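
For example, assuming version 2.3.0 built for Scala 2.12 (pick whatever the current release is on the downloads page):

$ tar -xzf kafka_2.12-2.3.0.tgz
$ cd kafka_2.12-2.3.0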

Start Zookeeper Server:

First, start the ZooKeeper server, since Kafka uses it for coordination.

$ bin/zookeeper-server-start.sh config/zookeeper.properties

Once your ZooKeeper server has started, leave that terminal open & do not close it. Just note down the port number (2181 by default).

Start Kafka Server:

Now that the ZooKeeper server is up and running, you can start the Kafka server.

$ bin/kafka-server-start.sh config/server.properties

Once your Kafka server has started, leave that terminal open & do not close it. The broker listens on port 9092 by default.

Create Topic:

The place where producer applications push streams of records is called a topic. A producer can push records to multiple topics. Let's create a sample topic for this demo.

$ bin/kafka-topics.sh --zookeeper localhost:2181 --create --topic firsttopic --partitions 1 --replication-factor 1
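
To verify that the topic was created with the expected settings, you can describe it with the same tool:

$ bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic firsttopic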

List all topics:

$ bin/kafka-topics.sh --list --zookeeper localhost:2181

You can delete a topic using the command below (note that in older Kafka versions this requires delete.topic.enable=true in the broker configuration):

$ bin/kafka-topics.sh --delete --zookeeper localhost:2181 --topic <topic-name>

Produce Data:

Send streams of records to a Kafka topic using the console producer.

$ bin/kafka-console-producer.sh --broker-list localhost:9092 --topic firsttopic
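
Note that --broker-list points at the Kafka broker (9092), not ZooKeeper. The console producer gives you a > prompt; each line you type is sent to the topic as a single record. For example:

> hello kafka
> this is my first record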

Consuming Data:

Receive streams of records from a Kafka topic using the console consumer.

$ bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic firsttopic --from-beginning
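
Since we passed --from-beginning, the consumer replays the topic from the start, so you should see the records typed into the producer above:

hello kafka
this is my first record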

In real-time applications you integrate the Kafka APIs to produce & consume data. Kafka has five core APIs (a small Producer API sketch in Java follows the list):

  • Producer API – To publish a stream of records to one or more Kafka topics.
  • Consumer API – To subscribe to one or more topics and process the stream of records.
  • Streams API – To transform input streams into output streams.
  • Connector API – To build reusable connectors that link Kafka topics to external data systems.
  • Admin API – To manage and inspect topics, brokers, and other Kafka objects.
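
As an illustration, here is a minimal Producer API sketch in Java. It assumes the org.apache.kafka:kafka-clients dependency is on the classpath; the broker address and topic name match the demo above.

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class SimpleProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        // Kafka broker started earlier (not the ZooKeeper port)
        props.put("bootstrap.servers", "localhost:9092");
        // Record keys and values are plain strings in this demo
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        // try-with-resources flushes and closes the producer on exit
        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("firsttopic", "key1", "hello from the Producer API"));
        }
    }
}

The Consumer API follows the same pattern: create a KafkaConsumer with deserializer properties and a group.id, subscribe() to the topic, and poll() in a loop for records.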

Refer to kafka.apache.org/intro for more details.


Thanks!

Happy Learning! Your feedback would be appreciated!