
How To Install Apache Kafka on AlmaLinux 9

In this tutorial, we will show you how to install Apache Kafka on AlmaLinux 9. Apache Kafka is a distributed streaming platform that allows you to build real-time data pipelines and streaming applications. It’s designed to handle high-throughput, fault-tolerant, and scalable data streams, making it an excellent choice for modern data architectures. AlmaLinux 9, as a binary-compatible alternative to Red Hat Enterprise Linux (RHEL), provides a solid foundation for running Kafka in production environments.

In this tutorial, we’ll cover everything you need to know to get Kafka up and running on AlmaLinux 9. From system preparation to configuration and basic usage, you’ll have a fully functional Kafka installation by the end of this guide.

Prerequisites

Before we dive into the installation process, let’s ensure you have everything you need:

  • A machine running AlmaLinux 9 (physical or virtual)
  • Root access or sudo privileges
  • Minimum of 2GB RAM (4GB or more recommended for production)
  • At least 5GB of free disk space
  • Internet connectivity for downloading packages

Additionally, you’ll need to configure your firewall to allow the following ports:

  • 2181 (ZooKeeper)
  • 9092 (Kafka)

If you’re using firewalld, you can open these ports with the following commands:

sudo firewall-cmd --permanent --add-port=2181/tcp
sudo firewall-cmd --permanent --add-port=9092/tcp
sudo firewall-cmd --reload

System Preparation

Let’s start by updating your system and installing some essential utilities:

sudo dnf update -y
sudo dnf install -y wget tar nc

Next, we’ll create a dedicated user and group for Kafka. This is a best practice for security and management purposes:

sudo groupadd kafka
sudo useradd -g kafka -m -s /bin/bash kafka

Now, let’s set up some environment variables. Add the following lines to the /etc/profile.d/kafka.sh file:

sudo tee /etc/profile.d/kafka.sh << EOF
export KAFKA_HOME=/opt/kafka
export PATH=\$PATH:\$KAFKA_HOME/bin
EOF

Apply the changes by sourcing the file:

source /etc/profile.d/kafka.sh

Java Installation

Apache Kafka requires Java to run. We’ll install OpenJDK 11, which is compatible with the latest versions of Kafka:

sudo dnf install -y java-11-openjdk-devel

Verify the Java installation:

java -version

You should see output indicating that Java 11 is installed. Next, let’s set the JAVA_HOME environment variable:

sudo tee /etc/profile.d/java.sh << EOF
export JAVA_HOME=$(dirname $(dirname $(readlink -f $(which javac))))
export PATH=\$PATH:\$JAVA_HOME/bin
EOF

Apply the changes:

source /etc/profile.d/java.sh

Apache Kafka Installation

Now we’re ready to install Kafka. First, download a stable release (this guide uses 3.4.0; check the Apache Kafka downloads page for the current version, as older releases are moved to archive.apache.org):

wget https://downloads.apache.org/kafka/3.4.0/kafka_2.13-3.4.0.tgz

Create the installation directory and extract the archive:

sudo mkdir -p /opt/kafka
sudo tar -xzf kafka_2.13-3.4.0.tgz -C /opt/kafka --strip-components=1

Set the correct ownership and permissions:

sudo chown -R kafka:kafka /opt/kafka
sudo chmod -R 755 /opt/kafka

Configuration Setup

Kafka’s main configuration file is server.properties. Let’s make a backup of the original file and create a new one with our desired settings:

sudo cp /opt/kafka/config/server.properties /opt/kafka/config/server.properties.bak
sudo tee /opt/kafka/config/server.properties << EOF
broker.id=0
listeners=PLAINTEXT://:9092
advertised.listeners=PLAINTEXT://your_server_ip:9092
num.network.threads=3
num.io.threads=8
socket.send.buffer.bytes=102400
socket.receive.buffer.bytes=102400
socket.request.max.bytes=104857600
log.dirs=/var/lib/kafka
num.partitions=1
num.recovery.threads.per.data.dir=1
offsets.topic.replication.factor=1
transaction.state.log.replication.factor=1
transaction.state.log.min.isr=1
log.retention.hours=168
log.segment.bytes=1073741824
log.retention.check.interval.ms=300000
zookeeper.connect=localhost:2181
zookeeper.connection.timeout.ms=18000
group.initial.rebalance.delay.ms=0
EOF

Replace your_server_ip with your actual server IP address. This configuration sets up a basic single-broker Kafka cluster; for production environments, adjust these settings to your specific requirements. Note that the configuration points log.dirs at /var/lib/kafka, so create that directory and give the kafka user ownership of it before starting the broker:

sudo mkdir -p /var/lib/kafka
sudo chown kafka:kafka /var/lib/kafka
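If you prefer to script the substitution rather than edit the file by hand, here is a minimal sketch. It is demonstrated on a throwaway copy so it can be tried safely; on the real server you would point the sed line (with sudo) at /opt/kafka/config/server.properties instead:

```shell
# Demonstration on a temporary copy; on the real server, run the sed line
# with sudo against /opt/kafka/config/server.properties instead.
CONF=$(mktemp)
printf 'advertised.listeners=PLAINTEXT://your_server_ip:9092\n' > "$CONF"

# On the server you could derive the address with: SERVER_IP=$(hostname -I | awk '{print $1}')
SERVER_IP=192.0.2.10   # example address (TEST-NET-1); substitute your own
sed -i "s/your_server_ip/${SERVER_IP}/" "$CONF"

cat "$CONF"   # advertised.listeners=PLAINTEXT://192.0.2.10:9092
```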

Service Configuration

To ensure Kafka starts automatically on system boot, we’ll create systemd service files for both ZooKeeper and Kafka.

First, create the ZooKeeper service file:

sudo tee /etc/systemd/system/zookeeper.service << EOF
[Unit]
Description=Apache ZooKeeper server
Documentation=http://zookeeper.apache.org
Requires=network.target remote-fs.target
After=network.target remote-fs.target

[Service]
Type=simple
User=kafka
ExecStart=/opt/kafka/bin/zookeeper-server-start.sh /opt/kafka/config/zookeeper.properties
ExecStop=/opt/kafka/bin/zookeeper-server-stop.sh
Restart=on-abnormal

[Install]
WantedBy=multi-user.target
EOF

Now, create the Kafka service file:

sudo tee /etc/systemd/system/kafka.service << EOF
[Unit]
Description=Apache Kafka Server
Documentation=http://kafka.apache.org/documentation.html
Requires=zookeeper.service
After=zookeeper.service

[Service]
Type=simple
User=kafka
ExecStart=/bin/sh -c '/opt/kafka/bin/kafka-server-start.sh /opt/kafka/config/server.properties > /opt/kafka/kafka.log 2>&1'
ExecStop=/opt/kafka/bin/kafka-server-stop.sh
Restart=on-abnormal

[Install]
WantedBy=multi-user.target
EOF

Reload the systemd daemon to recognize the new service files:

sudo systemctl daemon-reload

Starting and Testing

Now that we have everything set up, let’s start the ZooKeeper and Kafka services:

sudo systemctl start zookeeper
sudo systemctl start kafka

Verify that both services are running:

sudo systemctl status zookeeper
sudo systemctl status kafka

If everything is working correctly, you should see “active (running)” in the output for both services.

To ensure Kafka and ZooKeeper start automatically on system boot, enable the services:

sudo systemctl enable zookeeper
sudo systemctl enable kafka
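Beyond systemctl status, you can confirm the listeners are actually reachable with nc (installed during system preparation). A small sketch that simply reports per-port status:

```shell
# Check that ZooKeeper (2181) and Kafka (9092) accept TCP connections on localhost
for PORT in 2181 9092; do
  if command -v nc >/dev/null 2>&1 && nc -z -w 2 localhost "$PORT" 2>/dev/null; then
    echo "port $PORT: open"
  else
    echo "port $PORT: closed"
  fi
done
```

On a healthy installation both ports report open; a closed 9092 alongside an open 2181 usually means the broker failed during startup, so check /opt/kafka/kafka.log.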

Basic Usage Examples

Let’s test our Kafka installation by creating a topic and sending some messages.

Create a new topic called “test-topic”:

/opt/kafka/bin/kafka-topics.sh --create --bootstrap-server localhost:9092 --replication-factor 1 --partitions 1 --topic test-topic

Verify that the topic was created:

/opt/kafka/bin/kafka-topics.sh --list --bootstrap-server localhost:9092

Now, let’s send some messages to the topic using the console producer:

/opt/kafka/bin/kafka-console-producer.sh --bootstrap-server localhost:9092 --topic test-topic

Type a few messages and press Ctrl+D to exit.

To consume these messages, open a new terminal and run:

/opt/kafka/bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic test-topic --from-beginning

You should see the messages you sent earlier.
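The interactive steps above can also be scripted. A sketch, assuming the broker from this guide is running on localhost:9092 (the guard simply reports when Kafka is absent, so the snippet can be tried safely):

```shell
# Non-interactive produce/consume round trip against the test-topic created above
BIN=/opt/kafka/bin
if [ -x "$BIN/kafka-console-producer.sh" ]; then
  # Pipe two messages into the producer instead of typing them
  printf 'first message\nsecond message\n' | \
    "$BIN/kafka-console-producer.sh" --bootstrap-server localhost:9092 --topic test-topic
  # Read them back, stopping after two messages instead of waiting forever
  "$BIN/kafka-console-consumer.sh" --bootstrap-server localhost:9092 \
    --topic test-topic --from-beginning --max-messages 2
else
  echo "Kafka not found under $BIN"
fi
```

The --max-messages flag makes the consumer exit on its own, which is what you want in scripts and smoke tests.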

Troubleshooting Guide

If you encounter issues during installation or operation, here are some common problems and their solutions:

  • Kafka won’t start: Check the Kafka log file at /opt/kafka/kafka.log for error messages.
  • Connection refused errors: Ensure that the firewall is configured correctly and that the advertised listeners in server.properties are set to the correct IP address.
  • Out of memory errors: Increase the Java heap size by modifying the KAFKA_HEAP_OPTS environment variable in /opt/kafka/bin/kafka-server-start.sh.
  • Permission denied errors: Verify that the kafka user has the correct permissions on the Kafka installation directory and log directories.

Security Considerations

While this guide sets up a basic Kafka installation, it’s crucial to implement proper security measures for production environments:

  • Enable SSL/TLS encryption for client-broker communication
  • Implement SASL authentication for client connections
  • Set up ACLs (Access Control Lists) to control access to topics and resources
  • Regularly update Kafka and its dependencies to patch security vulnerabilities
  • Use security tools like Kafka’s built-in authorizer or third-party solutions for fine-grained access control
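As a concrete illustration of the first point, enabling an additional SSL listener in server.properties looks roughly like this. The keystore paths and passwords are placeholders you would generate yourself (for example with keytool); they are not files created by this guide:

```properties
listeners=PLAINTEXT://:9092,SSL://:9093
ssl.keystore.location=/opt/kafka/config/kafka.keystore.jks
ssl.keystore.password=changeit
ssl.key.password=changeit
ssl.truststore.location=/opt/kafka/config/kafka.truststore.jks
ssl.truststore.password=changeit
```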

Congratulations! You have successfully installed Apache Kafka. Thanks for using this tutorial to install the Apache Kafka distributed streaming platform on your AlmaLinux 9 system. For additional help or useful information, we recommend you check the official Apache Kafka website.


r00t

r00t is an experienced Linux enthusiast and technical writer with a passion for open-source software. With years of hands-on experience in various Linux distributions, r00t has developed a deep understanding of the Linux ecosystem and its powerful tools. He holds certifications in SCE and has contributed to several open-source projects. r00t is dedicated to sharing his knowledge and expertise through well-researched and informative articles, helping others navigate the world of Linux with confidence.