Apache Kafka is a distributed streaming platform. Let’s explain it in more detail. A streaming platform has three key capabilities: it lets us publish and subscribe to streams of records, similar to a message queue or enterprise messaging system; store those streams durably and with fault tolerance; and process them as they occur. Apache Kafka provides a distributed publish-subscribe messaging system and a robust queue that can handle a high volume of data and pass messages from producers to consumers.
Apache Kafka Advantages
Apache Kafka provides many benefits. The most important ones are listed below.
Reliability
Kafka is reliable because it is distributed, partitioned, replicated, and fault tolerant.
Scalability
Kafka scales without downtime.
Durability
Kafka persists messages on disk as fast as possible.
Performance
Kafka sustains high throughput for both publishing and subscribing. It keeps performance stable even with terabytes of messages stored.
Download and Install Kafka For Linux and Windows
We can install Apache Kafka on Linux distributions such as Ubuntu, Mint, Debian, Fedora, and CentOS, as well as on Windows, because Kafka is Java-based software. If Java is installed on the operating system, we can run Kafka easily.
Download Apache Kafka
We will download Apache Kafka from the following link, which directs us to the nearest mirror.
https://www.apache.org/dyn/closer.cgi?path=/kafka/2.2.0/kafka_2.12-2.2.0.tgz

In this case, we will download it on Ubuntu with the wget command.
$ wget http://kozyatagi.mirror.guzel.net.tr/apache/kafka/2.2.0/kafka_2.12-2.2.0.tgz

Extract Downloaded File
We will extract the downloaded file with the tar command like below.
$ tar xvf kafka_2.12-2.2.0.tgz

Then we will enter the extracted directory.
$ cd kafka_2.12-2.2.0/
Start ZooKeeper Server
Apache Kafka is managed with ZooKeeper. So in order to start Kafka, we will first start the ZooKeeper server. We will use the zookeeper-server-start.sh bash script with the provided default configuration file zookeeper.properties.
$ ./bin/zookeeper-server-start.sh config/zookeeper.properties

Start Kafka Server
Now we can start the Kafka server by using kafka-server-start.sh with the configuration file named server.properties. We run it in a separate terminal, because the ZooKeeper script keeps running in the foreground.
$ ./bin/kafka-server-start.sh ./config/server.properties

We will see the broker configuration printed in the console messages.

Create A Topic
We can create a topic named poftut with the kafka-topics.sh script.
$ bin/kafka-topics.sh --create --bootstrap-server localhost:9092 --replication-factor 1 --partitions 1 --topic poftut
Then we can list existing topics with the --list parameter like below.
$ bin/kafka-topics.sh --list --bootstrap-server localhost:9092
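The --partitions option used when creating the topic splits it into independent logs. As a rough illustration of how keyed messages could be spread across partitions, here is a minimal Python sketch. The function name choose_partition and the use of zlib.crc32 are assumptions made for illustration only. Kafka's own default partitioner uses a different hash function, so real partition numbers will differ.

```python
import zlib


def choose_partition(key: str, num_partitions: int) -> int:
    """Map a message key to a partition index in [0, num_partitions)."""
    # Stand-in hash for illustration only; not Kafka's real partitioner.
    return zlib.crc32(key.encode("utf-8")) % num_partitions


# With a single partition, as in the poftut topic, every key lands in partition 0.
print(choose_partition("user-42", 1))

# With more partitions, the same key always maps to the same partition.
print(choose_partition("user-42", 3) == choose_partition("user-42", 3))
```

The point of hashing the key is that all messages with the same key stay in one partition, so their relative order is preserved.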

Send Some Messages
We can send messages to the created topic with kafka-console-producer.sh like below. Each line we type is sent as a separate message.
$ bin/kafka-console-producer.sh --broker-list localhost:9092 --topic poftut

Start A Consumer
We can consume the messages in the given topic with kafka-console-consumer.sh like below.
$ bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic poftut --from-beginning
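To illustrate what --from-beginning changes, here is a minimal in-memory sketch in Python. It does not use Kafka's client API. TopicLog and its methods are hypothetical names, and the topic is modeled as a single append-only list. The idea is that every record has a sequential offset, and a consumer either starts at offset 0 or at the current end of the log.

```python
class TopicLog:
    """Toy stand-in for a single-partition topic: an append-only list."""

    def __init__(self):
        self.records = []

    def append(self, message):
        # Each record gets the next sequential offset, starting at 0.
        self.records.append(message)
        return len(self.records) - 1

    def end_offset(self):
        # Offset that the next appended record will receive.
        return len(self.records)

    def read_from(self, offset):
        # Return every record at or after the given offset.
        return self.records[offset:]


log = TopicLog()
log.append("first")
log.append("second")

# A consumer that joins now without --from-beginning starts at the end of the log.
latest_start = log.end_offset()

log.append("third")

from_beginning = log.read_from(0)             # sees every stored record
from_latest = log.read_from(latest_start)     # sees only records sent after it joined
print(from_beginning)
print(from_latest)
```

This is why a console consumer started without --from-beginning appears to print nothing until a new message is produced.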

Apache Kafka Use Cases
Apache Kafka can be used in different cases. In this part, we will list the most common and popular of them.
Messaging
Kafka can be used as a message broker. Kafka is designed for robust, stable, high-performance message delivery. In comparison to most messaging systems, Kafka has better throughput, built-in partitioning, replication, and fault tolerance, which makes it a good solution for message processing applications from small to large scale.
Website Activity Tracking
The original use case for Kafka was to be able to rebuild a user activity tracking pipeline as a set of real-time publish-subscribe feeds. Activity tracking is often high volume, as many activity messages are generated for each user page view.
Metrics
Kafka can be used for operational data monitoring. This involves tracking metrics and distributing them among different producers and consumers such as web, mobile, and desktop applications.
Log Aggregation
Log aggregation is a hard job to accomplish. Kafka can be used to collect logs centrally from different producers and senders and provide them to other log components such as a SIEM or a log archiver.
Stream Processing
Apache Kafka can process stream data easily. Stream processing can be done in multiple stages, where raw input data is aggregated, enriched, or transformed and written to new topics for further consumption.
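The multi-stage idea can be sketched in plain Python. This is not the Kafka Streams API. The topic contents, field names, and the functions enrich and count_by_section are hypothetical, and topics are modeled as simple lists. Each stage reads one stream and produces a new one that a later stage or consumer can use.

```python
from collections import Counter

# Hypothetical raw topic modeled as a plain list of records.
raw_page_views = [
    {"user": "alice", "page": "/home"},
    {"user": "bob", "page": "/pricing"},
    {"user": "alice", "page": "/pricing"},
]


def enrich(record):
    # Stage 1: add a derived field without changing the original record.
    return {**record, "section": record["page"].strip("/") or "root"}


def count_by_section(records):
    # Stage 2: aggregate the enriched stream into per-section counts.
    return Counter(r["section"] for r in records)


enriched_views = [enrich(r) for r in raw_page_views]   # would be written to a new topic
section_counts = count_by_section(enriched_views)      # would be written to another topic
print(dict(section_counts))
```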
Commit Log
Kafka can be used as an external commit log for a distributed system. The log can be used to replicate data between nodes and acts as a re-sync mechanism for failed nodes to restore their data.
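A minimal Python sketch of the re-sync idea follows. The Replica class and its fields are hypothetical, and the commit log is modeled as a list of key-value changes rather than a real Kafka topic. A node remembers the offset of the next entry it needs, so after an outage it replays only what it missed.

```python
class Replica:
    """Toy node that rebuilds its state by applying log entries in order."""

    def __init__(self):
        self.state = {}
        self.next_offset = 0  # offset of the first entry not yet applied

    def catch_up(self, commit_log):
        # Replay only the entries this replica has not applied yet.
        for key, value in commit_log[self.next_offset:]:
            self.state[key] = value
        self.next_offset = len(commit_log)


commit_log = [("x", 1), ("y", 2)]
replica = Replica()
replica.catch_up(commit_log)

# The replica goes offline while more changes are committed.
commit_log.append(("x", 10))
commit_log.append(("z", 3))

# After coming back, it re-syncs from its last applied offset.
replica.catch_up(commit_log)
print(replica.state)
```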
Apache Kafka Architecture
Apache Kafka has a very simple architecture from the user’s point of view. Different actors create, connect, process, and consume data with the Kafka system.

Producers
Producers create data and send it into the Kafka system in different ways, such as APIs. Producers can also use SDKs and libraries for different programming languages to push the data they create.
Connectors
Connectors are used to create scalable and reliable data streaming between Apache Kafka and other systems. These systems can be databases, file systems, etc.
Stream Processors
In some cases, input data should be processed or transformed. Stream processors are used to process and transform data from input topics and write the output to different topics or consumers.
Consumers
Consumers are the entities that mainly receive and use the data Kafka provides. Consumers can read data from different topics, whether that data has been processed or not.
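The four roles can be wired together in a small in-memory Python sketch. None of this is Kafka's API. The topics dictionary, the function names, and the topic names greetings and greetings-upper are all hypothetical, chosen only to show how data might flow between the actors.

```python
from collections import defaultdict

topics = defaultdict(list)  # topic name -> append-only list of records


def produce(topic, record):
    # Producer role: append a record to a topic.
    topics[topic].append(record)


def connect_source(rows, topic):
    # Connector role: copy rows from an external system into a topic.
    for row in rows:
        produce(topic, row)


def process(source_topic, target_topic, transform):
    # Stream processor role: read one topic, write transformed records to another.
    for record in topics[source_topic]:
        produce(target_topic, transform(record))


def consume(topic):
    # Consumer role: read every record currently in a topic.
    return list(topics[topic])


produce("greetings", "hello")
connect_source(["from-db-1", "from-db-2"], "greetings")
process("greetings", "greetings-upper", str.upper)
print(consume("greetings-upper"))
```

The consumer here reads the processed topic, but it could read the unprocessed greetings topic in exactly the same way.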