What Is Apache Kafka, Use Cases, Advantages and How To Install and Use Apache Kafka? – POFTUT



Apache Kafka is a distributed streaming platform. Let’s explain it in more detail. Apache Kafka has three key capabilities: it lets applications publish and subscribe to streams of records (similar to a message queue or enterprise messaging system), store those streams durably, and process them as they occur. Apache Kafka provides a distributed publish-subscribe messaging system and a robust queue that can handle a high volume of data and pass messages reliably from producers to consumers.

Apache Kafka Advantages

Apache Kafka provides many benefits to its users; the most important of them are listed below.

  • Reliability is provided by Kafka because it is distributed, partitioned, replicated, and fault tolerant.
  • Scalability is provided by Kafka, which can scale out with no downtime.
  • Durability is provided by Kafka, where messages are persisted to disk as quickly as possible.
  • Performance is provided by Kafka, with high throughput for both publishing and subscribing. It maintains stable performance even with terabytes of messages stored.

Download and Install Kafka For Linux and Windows

Kafka is Java-based software, so we can install it on Linux distributions like Ubuntu, Mint, Debian, Fedora, and CentOS, as well as on Windows. As long as we can install Java on these operating systems, we can run Kafka easily.
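Since Java is the only prerequisite, a quick way to verify it is to check for the java command first. This is a minimal sketch; the apt package name in the hint is an assumption for Debian/Ubuntu systems:

```shell
# Verify that a Java runtime is available before installing Kafka
# (Kafka 2.2 requires Java 8 or newer).
if command -v java >/dev/null 2>&1; then
    java -version
else
    echo "Java not found - install a JDK first, for example:"
    echo "  sudo apt install default-jdk    # Debian/Ubuntu (assumed package name)"
fi
```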

Download Apache Kafka

We will download Apache Kafka from the following link, which redirects us to the nearest mirror.

https://www.apache.org/dyn/closer.cgi?path=/kafka/2.2.0/kafka_2.12-2.2.0.tgz


In this case, we will download it on Ubuntu with the wget command.

$ wget http://kozyatagi.mirror.guzel.net.tr/apache/kafka/2.2.0/kafka_2.12-2.2.0.tgz

Extract Downloaded File

We will extract the downloaded file with the tar command like below.

$ tar xvf kafka_2.12-2.2.0.tgz

Then we will enter the extracted directory.

$ cd kafka_2.12-2.2.0/

Start ZooKeeper Server

Apache Kafka is managed with ZooKeeper, so in order to start Kafka we will first start a ZooKeeper server. We will use the zookeeper-server-start.sh script with the provided default configuration file zookeeper.properties.

$ ./bin/zookeeper-server-start.sh config/zookeeper.properties
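Before moving on, we can check that ZooKeeper is really listening. ZooKeeper answers the four-letter command ruok with imok on its client port (2181 by default in zookeeper.properties); this sketch assumes the nc (netcat) utility is installed:

```shell
# Ask ZooKeeper whether it is OK; a healthy server replies "imok".
# Assumes the default clientPort=2181 from config/zookeeper.properties.
reply=$(echo ruok | nc localhost 2181)
if [ "$reply" = "imok" ]; then
    echo "ZooKeeper is up"
else
    echo "No healthy ZooKeeper found on port 2181"
fi
```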

Start Kafka Server

Now we can start the Kafka Server by using kafka-server-start.sh with the configuration file named server.properties.

$ ./bin/kafka-server-start.sh ./config/server.properties

We will see the broker configuration printed in the console messages during startup.

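Running each server in the foreground occupies a terminal. The start scripts also accept a -daemon flag that sends the process to the background and writes its output under the logs/ directory, so both servers can be started from a single shell:

```shell
# Start ZooKeeper and Kafka in the background; output goes to ./logs/.
./bin/zookeeper-server-start.sh -daemon config/zookeeper.properties
./bin/kafka-server-start.sh -daemon ./config/server.properties
```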

Create A Topic

We can use the kafka-topics.sh script to create a topic named poftut.

$ bin/kafka-topics.sh --create --bootstrap-server localhost:9092 --replication-factor 1 --partitions 1 --topic poftut
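After creating the topic, kafka-topics.sh can also print its details with the --describe parameter, which shows the partition count, replication factor, and which broker leads each partition:

```shell
# Show partition and replica details for the poftut topic
bin/kafka-topics.sh --describe --bootstrap-server localhost:9092 --topic poftut
```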

Then we can list existing topics with the --list parameter like below.

$ bin/kafka-topics.sh --list --bootstrap-server localhost:9092

Send Some Messages

We can send some messages to the created topic with the kafka-console-producer.sh like below.

$ bin/kafka-console-producer.sh --broker-list localhost:9092 --topic poftut
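The console producer reads lines from standard input, so besides typing messages interactively we can also pipe them in, which is handy for scripting. The two sample messages below are arbitrary:

```shell
# Send two messages to the poftut topic non-interactively via a pipe
printf '%s\n' "first message" "second message" | \
    bin/kafka-console-producer.sh --broker-list localhost:9092 --topic poftut
```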

Start A Consumer

We can consume the messages in the provided topic with the kafka-console-consumer.sh like below.

$ bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic poftut --from-beginning
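By default the console consumer keeps running and printing new messages until interrupted with Ctrl+C. For a quick check, it can be told to exit after a fixed number of messages with the --max-messages parameter:

```shell
# Read the first two messages from the beginning of the topic, then exit
bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 \
    --topic poftut --from-beginning --max-messages 2
```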

Apache Kafka Use Cases

Apache Kafka can be used in many different cases. In this part, we will list the most common and popular of them.

Messaging

Kafka can be used as a message broker. It is designed as a robust, stable, high-performance message delivery system. In comparison to most messaging systems, Kafka has better throughput, built-in partitioning, replication, and fault tolerance, which makes it a good solution for message processing applications from small scale to large scale.


Website Activity Tracking

The original use case for Kafka was to rebuild a user activity tracking pipeline as a set of real-time publish-subscribe feeds. Activity tracking is often high volume, as many activity messages are generated for each user page view.

Metrics

Kafka can be used for operational data monitoring. This involves collecting metrics from different distributed applications, like web, mobile, and desktop clients, and aggregating them into centralized feeds.

Log Aggregation

Log aggregation is a hard job to accomplish. Kafka can be used to collect logs from different producers and senders in a central place and feed them to other log components like a SIEM, a log archiver, etc.

Stream Processing

Apache Kafka can process stream data easily. Stream processing can be done in multiple stages, where raw input data is aggregated, enriched, or transformed into new topics for further consumption.
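Kafka itself ships with a small Kafka Streams demo that illustrates this multi-stage idea: the WordCount example consumes a plain-text input topic, counts words, and writes the counts to an output topic. This is a sketch following the upstream Kafka Streams quickstart; the topic name and demo class below come from that quickstart, and the broker from the steps above must be running:

```shell
# Create the demo's input topic, then run the bundled WordCount stream processor
bin/kafka-topics.sh --create --bootstrap-server localhost:9092 \
    --replication-factor 1 --partitions 1 --topic streams-plaintext-input
bin/kafka-run-class.sh org.apache.kafka.streams.examples.wordcount.WordCountDemo
```

Messages produced to streams-plaintext-input are then continuously aggregated into word counts by the demo application.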

Commit Log

Kafka can be used as an external commit log for a distributed system. The log can be used to replicate data between nodes and acts as a re-sync mechanism for failed nodes.

Apache Kafka Architecture

Apache Kafka has a very simple architecture from the user’s point of view. There are different actors that are used to create, connect, process, and consume data with the Kafka system.


Producers

Producers create data and feed it into the Kafka system in different ways, such as through its APIs. Producers can also use SDKs and client libraries for different programming languages to push the data they create.

Connectors

Connectors are used to create scalable and reliable data streams between Apache Kafka and other systems. These systems can be databases, file systems, etc.


Stream Processors

In some cases, input data should be processed or transformed. Stream processors consume input data from one or more topics, process or transform it, and output the results to different topics or consumers.

Consumers

Consumers are the entities that read and use the data provided by Kafka. Consumers can read data from different topics, whether that data has been processed or not.
