How To Install Apache Kafka on AlmaLinux 9
In this tutorial, we will show you how to install Apache Kafka on AlmaLinux 9. Apache Kafka is a distributed streaming platform for building real-time data pipelines and streaming applications. It is designed for high throughput, fault tolerance, and horizontal scalability, making it an excellent fit for modern data architectures. AlmaLinux 9, as a binary-compatible alternative to Red Hat Enterprise Linux (RHEL), provides a solid foundation for running Kafka in production environments.
In this tutorial, we’ll cover everything you need to know to get Kafka up and running on AlmaLinux 9. From system preparation to configuration and basic usage, you’ll have a fully functional Kafka installation by the end of this guide.
Prerequisites
Before we dive into the installation process, let’s ensure you have everything you need:
- A machine running AlmaLinux 9 (physical or virtual)
- Root access or sudo privileges
- Minimum of 2GB RAM (4GB or more recommended for production)
- At least 5GB of free disk space
- Internet connectivity for downloading packages
Additionally, you’ll need to configure your firewall to allow the following ports:
- 2181 (ZooKeeper)
- 9092 (Kafka)
If you’re using firewalld, you can open these ports with the following commands:
sudo firewall-cmd --permanent --add-port=2181/tcp
sudo firewall-cmd --permanent --add-port=9092/tcp
sudo firewall-cmd --reload
System Preparation
Let’s start by updating your system and installing some essential utilities:
sudo dnf update -y
sudo dnf install -y wget tar nc
Next, we’ll create a dedicated user and group for Kafka. This is a best practice for security and management purposes:
sudo groupadd kafka
sudo useradd -g kafka -m -s /bin/bash kafka
Now, let’s set up some environment variables by adding the following lines to the /etc/profile.d/kafka.sh file:
sudo tee /etc/profile.d/kafka.sh << EOF
export KAFKA_HOME=/opt/kafka
export PATH=\$PATH:\$KAFKA_HOME/bin
EOF
Apply the changes by sourcing the file:
source /etc/profile.d/kafka.sh
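The backslashes in front of \$PATH and \$KAFKA_HOME above matter: with an unquoted heredoc delimiter (<< EOF), the shell expands plain $VARS at the moment the file is written, while escaped ones are written literally and expand later when the file is sourced. A scratch demo of the difference (writes only to /tmp, safe to run anywhere):

```shell
# Unquoted heredoc: $DEMO expands at write time, \$DEMO is written literally.
DEMO=now
tee /tmp/heredoc-demo.txt > /dev/null << EOF
expanded: $DEMO
literal: \$DEMO
EOF
cat /tmp/heredoc-demo.txt
```

The resulting file contains `expanded: now` and `literal: $DEMO`, which is why PATH and KAFKA_HOME are escaped in kafka.sh: they should expand at login time, not at write time.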
Java Installation
Apache Kafka requires Java to run. We’ll install OpenJDK 11, which is compatible with the latest versions of Kafka:
sudo dnf install -y java-11-openjdk-devel
Verify the Java installation:
java -version
You should see output indicating that Java 11 is installed. Next, let’s set the JAVA_HOME environment variable:
sudo tee /etc/profile.d/java.sh << EOF
export JAVA_HOME=$(dirname $(dirname $(readlink -f $(which javac))))
export PATH=\$PATH:\$JAVA_HOME/bin
EOF
Note the use of readlink -f, which fully resolves the chain of symlinks (/usr/bin/javac → /etc/alternatives/javac → the actual JDK path) in one step.
Apply the changes:
source /etc/profile.d/java.sh
Apache Kafka Installation
Now we’re ready to install Kafka. First, download the Kafka release archive (this guide uses version 3.4.0 built for Scala 2.13). Note that Apache moves older releases to archive.apache.org/dist/kafka/, so if the link below returns a 404, substitute the current version number listed on the Kafka downloads page:
wget https://downloads.apache.org/kafka/3.4.0/kafka_2.13-3.4.0.tgz
Create the installation directory and extract the archive:
sudo mkdir -p /opt/kafka
sudo tar -xzf kafka_2.13-3.4.0.tgz -C /opt/kafka --strip-components=1
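The --strip-components=1 flag drops the archive’s leading kafka_2.13-3.4.0/ directory so the contents land directly under /opt/kafka instead of /opt/kafka/kafka_2.13-3.4.0. If you want to see the effect in isolation, here is a scratch demonstration that only touches /tmp:

```shell
# Build a tiny archive with one top-level directory, then extract it with
# --strip-components=1 so that directory level is removed.
rm -rf /tmp/striptest && mkdir -p /tmp/striptest/kafka_2.13-3.4.0/bin
echo demo > /tmp/striptest/kafka_2.13-3.4.0/bin/tool.sh
tar -czf /tmp/striptest/demo.tgz -C /tmp/striptest kafka_2.13-3.4.0
mkdir -p /tmp/striptest/out
tar -xzf /tmp/striptest/demo.tgz -C /tmp/striptest/out --strip-components=1
ls /tmp/striptest/out    # shows "bin" at the top level, not kafka_2.13-3.4.0
```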
Set the correct ownership and permissions:
sudo chown -R kafka:kafka /opt/kafka
sudo chmod -R 755 /opt/kafka
Configuration Setup
Kafka’s main configuration file is server.properties. Let’s make a backup of the original file and create a new one with our desired settings:
sudo cp /opt/kafka/config/server.properties /opt/kafka/config/server.properties.bak
sudo tee /opt/kafka/config/server.properties << EOF
broker.id=0
listeners=PLAINTEXT://:9092
advertised.listeners=PLAINTEXT://your_server_ip:9092
num.network.threads=3
num.io.threads=8
socket.send.buffer.bytes=102400
socket.receive.buffer.bytes=102400
socket.request.max.bytes=104857600
log.dirs=/var/lib/kafka
num.partitions=1
num.recovery.threads.per.data.dir=1
offsets.topic.replication.factor=1
transaction.state.log.replication.factor=1
transaction.state.log.min.isr=1
log.retention.hours=168
log.segment.bytes=1073741824
log.retention.check.interval.ms=300000
zookeeper.connect=localhost:2181
zookeeper.connection.timeout.ms=18000
group.initial.rebalance.delay.ms=0
EOF
Replace your_server_ip with your actual server IP address. Also note that log.dirs points at /var/lib/kafka, which does not exist yet; create it and give the kafka user ownership:
sudo mkdir -p /var/lib/kafka
sudo chown -R kafka:kafka /var/lib/kafka
This configuration sets up a basic single-broker Kafka cluster. For production environments, you’ll want to adjust these settings based on your specific requirements.
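Instead of editing the placeholder by hand, you can substitute it with sed. The snippet below demonstrates on a scratch copy using the documentation address 192.0.2.10; on the real server, point sed at /opt/kafka/config/server.properties and derive the address with something like hostname -I:

```shell
# Demo on a scratch file; 192.0.2.10 is a documentation placeholder address.
printf 'advertised.listeners=PLAINTEXT://your_server_ip:9092\n' > /tmp/server.properties.demo
ip_addr=192.0.2.10    # on a real host: ip_addr=$(hostname -I | awk '{print $1}')
sed -i "s/your_server_ip/${ip_addr}/" /tmp/server.properties.demo
cat /tmp/server.properties.demo    # advertised.listeners=PLAINTEXT://192.0.2.10:9092
```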
Service Configuration
To ensure Kafka starts automatically on system boot, we’ll create systemd service files for both ZooKeeper and Kafka.
First, create the ZooKeeper service file:
sudo tee /etc/systemd/system/zookeeper.service << EOF
[Unit]
Description=Apache ZooKeeper server
Documentation=http://zookeeper.apache.org
Requires=network.target remote-fs.target
After=network.target remote-fs.target
[Service]
Type=simple
User=kafka
ExecStart=/opt/kafka/bin/zookeeper-server-start.sh /opt/kafka/config/zookeeper.properties
ExecStop=/opt/kafka/bin/zookeeper-server-stop.sh
Restart=on-abnormal
[Install]
WantedBy=multi-user.target
EOF
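One caveat before relying on this service: the stock /opt/kafka/config/zookeeper.properties keeps its data in /tmp/zookeeper, which is cleared on reboot. For anything beyond a throwaway test, point dataDir at a persistent location and create that directory with kafka:kafka ownership, for example:

```ini
# /opt/kafka/config/zookeeper.properties (relevant lines)
dataDir=/var/lib/zookeeper
clientPort=2181
```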
Now, create the Kafka service file:
sudo tee /etc/systemd/system/kafka.service << EOF
[Unit]
Description=Apache Kafka Server
Documentation=http://kafka.apache.org/documentation.html
Requires=zookeeper.service
After=zookeeper.service
[Service]
Type=simple
User=kafka
ExecStart=/bin/sh -c '/opt/kafka/bin/kafka-server-start.sh /opt/kafka/config/server.properties > /opt/kafka/kafka.log 2>&1'
ExecStop=/opt/kafka/bin/kafka-server-stop.sh
Restart=on-abnormal
[Install]
WantedBy=multi-user.target
EOF
Reload the systemd daemon to recognize the new service files:
sudo systemctl daemon-reload
Starting and Testing
Now that we have everything set up, let’s start the ZooKeeper and Kafka services:
sudo systemctl start zookeeper
sudo systemctl start kafka
Verify that both services are running:
sudo systemctl status zookeeper
sudo systemctl status kafka
If everything is working correctly, you should see “active (running)” in the output for both services.
To ensure Kafka and ZooKeeper start automatically on system boot, enable the services:
sudo systemctl enable zookeeper
sudo systemctl enable kafka
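With both services enabled, you can also confirm that they are accepting connections on their ports. The helper below uses bash’s built-in /dev/tcp redirection (the nc utility installed earlier works too, e.g. nc -z localhost 2181); the check_port function name is just for this example:

```shell
# Report whether a TCP port accepts connections (bash /dev/tcp, 2s timeout).
check_port() {
  local host=$1 port=$2
  if timeout 2 bash -c "echo > /dev/tcp/${host}/${port}" 2>/dev/null; then
    echo "${host}:${port} open"
  else
    echo "${host}:${port} closed"
  fi
}
check_port localhost 2181   # ZooKeeper
check_port localhost 9092   # Kafka
```

On a healthy installation both lines should report "open"; a "closed" result usually means the corresponding service failed to start.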
Basic Usage Examples
Let’s test our Kafka installation by creating a topic and sending some messages.
Create a new topic called “test-topic”:
/opt/kafka/bin/kafka-topics.sh --create --bootstrap-server localhost:9092 --replication-factor 1 --partitions 1 --topic test-topic
Verify that the topic was created:
/opt/kafka/bin/kafka-topics.sh --list --bootstrap-server localhost:9092
Now, let’s send some messages to the topic using the console producer:
/opt/kafka/bin/kafka-console-producer.sh --bootstrap-server localhost:9092 --topic test-topic
Type a few messages and press Ctrl+D to exit.
To consume these messages, open a new terminal and run:
/opt/kafka/bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic test-topic --from-beginning
You should see the messages you sent earlier. The consumer keeps running until you press Ctrl+C.
Troubleshooting Guide
If you encounter issues during installation or operation, here are some common problems and their solutions:
- Kafka won’t start: check the Kafka log file at /opt/kafka/kafka.log for error messages.
- Connection refused errors: ensure that the firewall is configured correctly and that advertised.listeners in server.properties is set to the correct IP address.
- Out of memory errors: increase the Java heap size by modifying the KAFKA_HEAP_OPTS environment variable in /opt/kafka/bin/kafka-server-start.sh.
- Permission denied errors: verify that the kafka user has the correct permissions on the Kafka installation directory and log directories.
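For the out-of-memory case, rather than editing kafka-server-start.sh in place (a change that would be lost on upgrade), you can set KAFKA_HEAP_OPTS through a systemd drop-in. The fragment below is a sketch with example heap sizes you should tune to your machine; save it as /etc/systemd/system/kafka.service.d/heap.conf, then run sudo systemctl daemon-reload and restart Kafka:

```ini
# /etc/systemd/system/kafka.service.d/heap.conf
[Service]
Environment="KAFKA_HEAP_OPTS=-Xms512M -Xmx1G"
```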
Security Considerations
While this guide sets up a basic Kafka installation, it’s crucial to implement proper security measures for production environments:
- Enable SSL/TLS encryption for client-broker communication
- Implement SASL authentication for client connections
- Set up ACLs (Access Control Lists) to control access to topics and resources
- Regularly update Kafka and its dependencies to patch security vulnerabilities
- Use security tools like Kafka’s built-in authorizer or third-party solutions for fine-grained access control
Congratulations! You have successfully installed Apache Kafka. Thanks for using this tutorial for installing Apache Kafka distributed streaming platform on your AlmaLinux 9 system. For additional help or useful information, we recommend you check the official Apache website.