    Apache Kafka is an open source, distributed, high-throughput publish-subscribe messaging system.

    If you are approaching Kafka for the first time, this post will help you get a distributed Kafka cluster running on your system with minimal steps. In this guide, we will walk through the steps to set up Kafka on Ubuntu 16.04.

    The basic architecture of Kafka is organized around a few key terms:

    Zookeeper: a coordination service that keeps track of the brokers in the cluster.

    Topic: a category to which messages are published by the message producers.

    Broker: a server instance that handles reads and writes of messages.

    Producer: a client that writes data into the cluster.

    Consumer: a client that reads data from the cluster.

    Step 1: Install Java

    Kafka needs a Java runtime environment:

    $sudo apt-get update
    $sudo apt-get install default-jre

    Step 2: Install Zookeeper

    Zookeeper is a key-value store used to maintain server state, and it is mandatory for running Kafka.

    It is a centralized service for maintaining configuration, and it also handles leader election among the brokers.

    $sudo apt-get install zookeeperd

    Let’s check whether it is alive:

    $telnet localhost 2181

    At the prompt, enter:

    ruok

    If everything is okay, the telnet session will reply with:

    imok

    Step 3: Create a service user for Kafka

    Kafka is a network application; creating a non-sudo user minimizes the risk to the machine if the service is compromised. Let’s create a user for it and name it “kafka”:

    $sudo adduser --system --no-create-home --disabled-password --disabled-login kafka

    Step 4: Install Kafka

    Download Kafka and extract it in a convenient location, typically /opt:

    $cd ~
    $wget http://www-eu.apache.org/dist/kafka/1.1.0/kafka_2.11-1.1.0.tgz
    $sudo mkdir /opt/kafka
    $sudo tar -xvzf kafka_2.11-1.1.0.tgz --directory /opt/kafka --strip-components 1

    Step 5: Configure the kafka server

    As Kafka stores its data on disk, we will create a directory for it:

    $sudo mkdir /var/lib/kafka
    $sudo mkdir /var/lib/kafka/data

    Since we are setting up a distributed Kafka cluster, let’s configure the 3 brokers.

    If you open /opt/kafka/config/server.properties you will see many properties, but we will be dealing with only 3 of them. These three properties must be unique for each broker instance:

    broker.id=0

    listeners=PLAINTEXT://:9092

    log.dirs=/tmp/kafka-logs

    As we have 3 brokers, we will create a properties file for each one. Let’s copy /opt/kafka/config/server.properties and create 3 files, one per instance:

    $sudo cp /opt/kafka/config/server.properties /opt/kafka/config/server-1.properties
    $sudo cp /opt/kafka/config/server.properties /opt/kafka/config/server-2.properties
    $sudo cp /opt/kafka/config/server.properties /opt/kafka/config/server-3.properties

    Create the log directories for each server.

    $sudo mkdir /var/lib/kafka/data/server-1
    $sudo mkdir /var/lib/kafka/data/server-2
    $sudo mkdir /var/lib/kafka/data/server-3

    We will be using these directories in configuration.

    Now, make the configuration changes for each Kafka server. Open each file in a text editor; I am using nano.

    server-1.properties

    $sudo nano /opt/kafka/config/server-1.properties

    broker.id=1

    listeners=PLAINTEXT://:9093

    log.dirs=/var/lib/kafka/data/server-1

    Save the changes and move on to the next server.

    server-2.properties

    $sudo nano /opt/kafka/config/server-2.properties

    broker.id=2

    listeners=PLAINTEXT://:9094

    log.dirs=/var/lib/kafka/data/server-2

    server-3.properties

    $sudo nano /opt/kafka/config/server-3.properties

    broker.id=3

    listeners=PLAINTEXT://:9095

    log.dirs=/var/lib/kafka/data/server-3
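    The three per-broker edits above can also be scripted instead of being made by hand. Below is a minimal sketch using sed; to keep it safe to run anywhere, it works on a scratch copy of the default lines in a temporary directory, but in practice you would point it at the files under /opt/kafka/config (so the paths here are illustrative):

```shell
# Work on a scratch copy of the default properties so nothing real is touched.
tmp=$(mktemp -d)
cat > "$tmp/server.properties" <<'EOF'
broker.id=0
listeners=PLAINTEXT://:9092
log.dirs=/tmp/kafka-logs
EOF

# Generate server-1..3.properties, each with a unique id, port, and data directory.
for i in 1 2 3; do
  cp "$tmp/server.properties" "$tmp/server-$i.properties"
  sed -i \
    -e "s|^broker.id=.*|broker.id=$i|" \
    -e "s|^listeners=.*|listeners=PLAINTEXT://:$((9092 + i))|" \
    -e "s|^log.dirs=.*|log.dirs=/var/lib/kafka/data/server-$i|" \
    "$tmp/server-$i.properties"
done

# Show the result for all three brokers.
grep . "$tmp"/server-*.properties
```

    Running the same loop against /opt/kafka/config/server-N.properties (under sudo) applies exactly the three changes shown above.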

    If you would like to be able to delete topics, you need to edit the delete.topic.enable setting. By default, Kafka does not allow topic deletion; it must be enabled in the configuration. Find the line in each server properties file and change it:

    delete.topic.enable=true

    Step 6: Set permissions on the Kafka directories

    We will assign ownership of the Kafka directories to the kafka user (created in Step 3):

    $sudo chown -R kafka:nogroup /opt/kafka
    $sudo chown -R kafka:nogroup /var/lib/kafka

    Step 7: Start the brokers

    Now, we can start our brokers. In each of three separate terminal sessions, change into the Kafka directory and start one broker:

    $cd /opt/kafka
    $bin/kafka-server-start.sh config/server-1.properties
    $bin/kafka-server-start.sh config/server-2.properties
    $bin/kafka-server-start.sh config/server-3.properties

    You should see a startup message when the brokers start successfully.
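    Rather than keeping three terminal sessions open, you may prefer to run each broker as a background service under systemd (the init system on Ubuntu 16.04), which also lets the broker run as the kafka user created in Step 3. Here is a minimal sketch of one unit file, e.g. /etc/systemd/system/kafka-server-1.service; the unit name and Restart policy are illustrative, and the zookeeper.service dependency assumes the service name installed by the zookeeperd package:

```
[Unit]
Description=Apache Kafka broker 1
Requires=zookeeper.service
After=zookeeper.service

[Service]
User=kafka
ExecStart=/opt/kafka/bin/kafka-server-start.sh /opt/kafka/config/server-1.properties
Restart=on-abnormal

[Install]
WantedBy=multi-user.target
```

    After creating similar units for brokers 2 and 3, run sudo systemctl daemon-reload and then sudo systemctl start kafka-server-1 (and so on for the other two).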

    Test the installation

    Create a topic

    We need to create a topic first.

    $bin/kafka-topics.sh --create --topic topic-1 --zookeeper localhost:2181 --partitions 3 --replication-factor 3

    You should see a confirmation message after you create a topic.

    The --partitions option controls how many partitions the topic’s data is split into across the brokers. As we have 3 brokers, we can set this to 3.

    The --replication-factor option controls how many copies of the data are kept. This is helpful because when any broker goes down, the other brokers can take over its work.
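    To see why the partition count matters, here is a simplified sketch of what a keyed producer does: hash each message key, then take it modulo the number of partitions, so the same key always lands on the same partition. (Kafka’s default partitioner actually uses a murmur2 hash; the standard cksum tool stands in here purely to illustrate the idea.)

```shell
# Map some example keys onto 3 partitions: hash(key) mod partition_count.
partitions=3
mapping=$(for key in order-1 order-2 order-3 order-4; do
  h=$(printf '%s' "$key" | cksum | cut -d' ' -f1)   # CRC stand-in for murmur2
  echo "$key -> partition $((h % partitions))"
done)
echo "$mapping"
```

    Because a topic’s partitions are spread across the brokers, 3 partitions let all 3 brokers share the load for topic-1.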

    The Producer instance

    A producer feeds data into the Kafka cluster. This command will push data into the cluster:

    $bin/kafka-console-producer.sh --broker-list localhost:9093,localhost:9094,localhost:9095 --topic topic-1

    The --broker-list option takes the list of brokers which we have configured.

    The --topic option specifies the topic under which you want to push the data. In our case, that is topic-1.

    Once you execute this command, you will see a prompt where you can enter a message. Hit Enter after each message to send it as a new record.

    Consumers

    We’ve produced some messages; now let’s consume them. Run this command to consume the messages:

    $bin/kafka-console-consumer.sh --bootstrap-server localhost:9093 --topic topic-1 --from-beginning

    The --bootstrap-server option is the broker to connect to first; it can be any of our 3 brokers.

    The --from-beginning option tells the consumer to read messages from the beginning of the topic.

    This command shows all the messages which have been produced so far. If you keep a producer running in a separate terminal, you will also see new messages appear as they are added.

    Hope this helps you set up and configure Kafka on Ubuntu 16.04. Please try it and experiment.
