How To Install Apache Kafka on Ubuntu 24.04 LTS
Apache Kafka is a distributed platform for handling real-time data streams, and it has become a core component of many modern data architectures. This guide walks you through installing Apache Kafka on Ubuntu 24.04 LTS, from preparing the system to sending a first test message.
Prerequisites
Before diving into the installation process, make sure your system meets the following requirements:
- A server running Ubuntu 24.04 LTS with at least 2GB of RAM and 2 CPU cores
- Root or sudo access to the server
- A stable internet connection for downloading necessary packages
Additionally, you’ll need to install the following software:
- Java Development Kit (JDK) 11 or later
- wget or curl for downloading Kafka binaries
Ensure you have sufficient permissions to create directories, modify configuration files, and manage system services.
Understanding Apache Kafka
Apache Kafka is a distributed event streaming platform capable of handling trillions of events a day. Originally developed at LinkedIn, Kafka is now an open-source Apache project used by thousands of companies for high-performance data pipelines, streaming analytics, data integration, and mission-critical applications.
Key features of Kafka include:
- High throughput and low latency
- Scalability and fault tolerance
- Durability and reliability
- Distributed architecture
Kafka finds applications in various scenarios, such as:
- Real-time analytics
- Log aggregation
- Event-driven architectures
- Stream processing
Preparing the Ubuntu 24.04 LTS Environment
Begin by updating your Ubuntu system to ensure you have the latest packages and security updates:
sudo apt update && sudo apt upgrade -y
Next, install the Java Development Kit (JDK). Kafka requires Java to run, so let’s install OpenJDK:
sudo apt install openjdk-11-jdk -y
Verify the Java installation by checking the version:
java -version
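If you want to script this check, the helper below is a minimal sketch that extracts the major version from a Java version string and compares it with the JDK 11 minimum listed in the prerequisites. The function names java_major and java_is_supported are made up for this guide and are not part of Java or Kafka.

```shell
# Print the major version from a Java version string.
java_major() {
  case "$1" in
    1.*) echo "$1" | cut -d. -f2 ;;                 # legacy scheme: 1.8.0_392 -> 8
    *)   echo "$1" | cut -d. -f1 | cut -d- -f1 ;;   # modern scheme: 17.0.9 -> 17, 21-ea -> 21
  esac
}

# Succeed when the version string ($1) meets the minimum major version ($2, default 11).
java_is_supported() {
  [ "$(java_major "$1")" -ge "${2:-11}" ]
}

# Example use (java -version writes to stderr, hence 2>&1):
#   ver=$(java -version 2>&1 | awk -F'"' '/version/ {print $2; exit}')
#   java_is_supported "$ver" && echo "Java $ver is new enough"
```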
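Create a dedicated user for running Kafka. This improves security by isolating the Kafka process. The --group option also creates a matching kafka group, which the chown commands later in this guide rely on:
sudo adduser --system --group --no-create-home --disabled-password --disabled-login kafka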
Downloading and Extracting Apache Kafka
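Visit the Apache Kafka downloads page to find the latest stable version. At the time of writing, we’ll use Kafka 3.8.0 (built for Scala 2.13). If the link below no longer works, the release has probably been superseded; older releases are moved to archive.apache.org. Download the binary using wget: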
wget https://downloads.apache.org/kafka/3.8.0/kafka_2.13-3.8.0.tgz
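Before extracting, it is worth checking the download against the SHA-512 checksum that Apache publishes next to each release file. The sketch below assumes the checksum file uses the "name: HEX HEX ..." layout, which is why it normalises the file rather than calling sha512sum -c directly. The function names are hypothetical helpers for this guide.

```shell
# Normalise a published checksum file ($1) to a bare lowercase SHA-512 digest.
# Assumes the "name: HEX HEX ..." layout; a plain "digest  name" file also works
# because only the first 128 characters are kept.
expected_sha512() {
  sed 's/^[^:]*://' "$1" | tr -d '[:space:]' | tr 'A-F' 'a-f' | cut -c1-128
}

# Succeed when the digest of the archive ($1) matches its checksum file ($2).
verify_sha512() {
  actual=$(sha512sum "$1" | cut -d' ' -f1)
  [ "$actual" = "$(expected_sha512 "$2")" ]
}

# Example use:
#   wget https://downloads.apache.org/kafka/3.8.0/kafka_2.13-3.8.0.tgz.sha512
#   verify_sha512 kafka_2.13-3.8.0.tgz kafka_2.13-3.8.0.tgz.sha512 && echo "checksum OK"
```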
Extract the downloaded archive:
tar -xzf kafka_2.13-3.8.0.tgz
Move the extracted directory to a more appropriate location:
sudo mv kafka_2.13-3.8.0 /opt/kafka
Configuring Apache Kafka
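Create the directory where Kafka will store its data. Despite the name, the log.dirs setting refers to the commit logs that hold topic messages, not to application log files:
sudo mkdir -p /var/lib/kafka/data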
Set the correct ownership for these directories:
sudo chown -R kafka:kafka /opt/kafka
sudo chown -R kafka:kafka /var/lib/kafka
Now, let’s modify the Kafka configuration. Open the server.properties file:
sudo nano /opt/kafka/config/server.properties
Update the following lines:
log.dirs=/var/lib/kafka/data
listeners=PLAINTEXT://your_server_ip:9092
advertised.listeners=PLAINTEXT://your_server_ip:9092
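Replace “your_server_ip” with your actual server IP address or domain name. Note that the broker then listens only on that address, so if a later command that uses localhost:9092 cannot connect, substitute the same address there.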
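If you prefer not to edit the file by hand, the same changes can be scripted. The set_property function below is a hypothetical helper written for this guide, not a Kafka tool. It assumes GNU sed (the default on Ubuntu) and values that do not contain the characters | or &.

```shell
# Set key=value in a properties file: replace an existing (possibly commented-out)
# entry for the key, or append the line when the key is absent.
set_property() {
  file=$1 key=$2 value=$3
  # Escape dots so that "log.dirs" is matched literally.
  pat=$(printf '%s' "$key" | sed 's/\./\\./g')
  if grep -qE "^#?${pat}=" "$file"; then
    sed -i -E "s|^#?${pat}=.*|${key}=${value}|" "$file"
  else
    printf '%s=%s\n' "$key" "$value" >> "$file"
  fi
}

# Example use, from a shell that can write to the file (for example a root shell):
#   set_property /opt/kafka/config/server.properties log.dirs /var/lib/kafka/data
#   set_property /opt/kafka/config/server.properties listeners PLAINTEXT://your_server_ip:9092
#   set_property /opt/kafka/config/server.properties advertised.listeners PLAINTEXT://your_server_ip:9092
```

The same helper works for the dataDir entry in zookeeper.properties.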
Next, configure Zookeeper. Open the zookeeper.properties file:
sudo nano /opt/kafka/config/zookeeper.properties
Update the dataDir line:
dataDir=/var/lib/kafka/zookeeper
Creating Systemd Unit Files
To manage Kafka as a system service, create systemd unit files. First, for Zookeeper:
sudo nano /etc/systemd/system/zookeeper.service
Add the following content:
[Unit]
Description=Apache Zookeeper server
Documentation=http://zookeeper.apache.org
Requires=network.target remote-fs.target
After=network.target remote-fs.target
[Service]
Type=simple
User=kafka
ExecStart=/opt/kafka/bin/zookeeper-server-start.sh /opt/kafka/config/zookeeper.properties
ExecStop=/opt/kafka/bin/zookeeper-server-stop.sh
Restart=on-abnormal
[Install]
WantedBy=multi-user.target
Now, create a service file for Kafka:
sudo nano /etc/systemd/system/kafka.service
Add the following content:
[Unit]
Description=Apache Kafka Server
Documentation=http://kafka.apache.org/documentation.html
Requires=zookeeper.service
After=zookeeper.service
[Service]
Type=simple
User=kafka
ExecStart=/bin/sh -c '/opt/kafka/bin/kafka-server-start.sh /opt/kafka/config/server.properties > /dev/null 2>&1'
ExecStop=/opt/kafka/bin/kafka-server-stop.sh
Restart=on-abnormal
[Install]
WantedBy=multi-user.target
Reload the systemd daemon to recognize the new service files:
sudo systemctl daemon-reload
Starting and Testing Kafka Services
Start the Zookeeper service:
sudo systemctl start zookeeper
Verify that Zookeeper is running:
sudo systemctl status zookeeper
If everything looks good, start the Kafka service:
sudo systemctl start kafka
Check the Kafka service status:
sudo systemctl status kafka
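An "active" status only shows that the process started, not that it is accepting connections yet. As a rough sketch, the helper below polls a TCP port until it opens or a timeout expires. wait_for_port is a hypothetical name, and the function relies on bash's /dev/tcp feature and the timeout command from coreutils.

```shell
# Wait until host:port accepts TCP connections, or give up after a limit in seconds.
wait_for_port() {
  host=$1 port=$2 limit=${3:-30} waited=0
  while [ "$waited" -lt "$limit" ]; do
    # bash opens a TCP connection when redirecting to /dev/tcp/HOST/PORT.
    if timeout 1 bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null; then
      return 0
    fi
    sleep 1
    waited=$((waited + 1))
  done
  return 1
}

# Example use (replace your_server_ip with the address set in listeners):
#   wait_for_port localhost 2181 && wait_for_port your_server_ip 9092 && echo "both services are reachable"
```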
Enable both services to start on boot:
sudo systemctl enable zookeeper
sudo systemctl enable kafka
Creating Topics and Testing Kafka Installation
To ensure Kafka is working correctly, let’s create a test topic:
/opt/kafka/bin/kafka-topics.sh --create --bootstrap-server localhost:9092 --replication-factor 1 --partitions 1 --topic test
Verify the topic creation:
/opt/kafka/bin/kafka-topics.sh --list --bootstrap-server localhost:9092
Now, let’s produce some messages to the topic:
echo "Hello, Kafka!" | /opt/kafka/bin/kafka-console-producer.sh --bootstrap-server localhost:9092 --topic test
To consume messages from the topic, run:
/opt/kafka/bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic test --from-beginning
You should see the message “Hello, Kafka!” appear in the console.
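The console consumer keeps running until you press Ctrl+C. For an unattended check, the produce and consume steps can be combined into one function, sketched below. kafka_roundtrip is a hypothetical helper name; the --timeout-ms option makes the consumer exit once no new message has arrived for the given interval.

```shell
# Send one message and confirm it can be read back.
# $1 = Kafka bin directory, $2 = bootstrap server, $3 = topic, $4 = message text
kafka_roundtrip() {
  bin=$1 server=$2 topic=$3 msg=$4
  printf '%s\n' "$msg" |
    "$bin/kafka-console-producer.sh" --bootstrap-server "$server" --topic "$topic" || return 1
  # Read the topic from the start and look for an exact match of the message.
  "$bin/kafka-console-consumer.sh" --bootstrap-server "$server" --topic "$topic" \
    --from-beginning --timeout-ms 10000 2>/dev/null | grep -qxF -- "$msg"
}

# Example use:
#   kafka_roundtrip /opt/kafka/bin localhost:9092 test "Hello, Kafka!" && echo "round trip OK"
```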
Securing Apache Kafka
Security is crucial for production Kafka deployments. Here are some steps to enhance Kafka security:
- Enable SSL/TLS encryption for client-broker communication
- Implement SASL authentication for client connections
- Set up Access Control Lists (ACLs) to manage topic-level permissions
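To enable SSL, edit the server.properties file. Replace the existing listeners entry with the SSL listener below, switch advertised.listeners to SSL://your_server_ip:9093 to match, and, if no PLAINTEXT listener remains, set security.inter.broker.protocol=SSL. Then add the keystore and truststore settings: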
listeners=SSL://your_server_ip:9093
ssl.keystore.location=/path/to/kafka.server.keystore.jks
ssl.keystore.password=keystore_password
ssl.key.password=key_password
ssl.truststore.location=/path/to/kafka.server.truststore.jks
ssl.truststore.password=truststore_password
Remember to generate the necessary SSL certificates and keystores before enabling SSL.
Troubleshooting Common Issues
When installing and running Kafka, you might encounter some common issues:
Port Conflicts
If Kafka fails to start due to port conflicts, ensure no other services are using ports 2181 (Zookeeper) and 9092 (Kafka). Ubuntu 24.04 does not install netstat by default, so use ss, which accepts the same options here:
sudo ss -tulpn | grep LISTEN
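To narrow the listing to the two ports this guide uses, a small filter can be applied to the output of ss (part of iproute2, installed by default on Ubuntu). The function name kafka_port_listeners is made up for this example.

```shell
# Keep only lines for sockets bound to the Zookeeper (2181) or Kafka (9092) port.
# Reads `ss -tlnp` style output on stdin; the trailing whitespace match keeps
# ports such as 19092 from matching.
kafka_port_listeners() {
  grep -E ':(2181|9092)[[:space:]]'
}

# Example use:
#   sudo ss -tlnp | kafka_port_listeners
```

An empty result before starting the services means both ports are free.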
JVM Memory Errors
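If Kafka crashes due to insufficient memory, adjust the JVM heap size. The default is set by the following line in /opt/kafka/bin/kafka-server-start.sh, which only applies when KAFKA_HEAP_OPTS is not already set in the environment, so you can either edit the line or set the variable for the service: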
export KAFKA_HEAP_OPTS="-Xmx1G -Xms1G"
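Editing the start script means repeating the change after every Kafka upgrade. One alternative, sketched here, is a systemd drop-in that sets KAFKA_HEAP_OPTS for the kafka service. The heap_dropin function and the heap.conf file name are illustrative choices, not anything Kafka requires.

```shell
# Print a systemd drop-in that sets the Kafka heap size (for example "2G").
heap_dropin() {
  printf '[Service]\nEnvironment="KAFKA_HEAP_OPTS=-Xmx%s -Xms%s"\n' "$1" "$1"
}

# Example use:
#   sudo mkdir -p /etc/systemd/system/kafka.service.d
#   heap_dropin 2G | sudo tee /etc/systemd/system/kafka.service.d/heap.conf
#   sudo systemctl daemon-reload
#   sudo systemctl restart kafka
```

Pick a heap size that leaves memory for the operating system's page cache, which Kafka relies on heavily.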
Log File Analysis
Always check Kafka logs for detailed error messages:
tail -f /opt/kafka/logs/server.log
Performance Tuning and Optimization
To optimize Kafka performance, consider the following adjustments:
- Increase the number of partitions for high-throughput topics
- Adjust the retention period and size for topics
- Fine-tune producer batch size and linger time
- Optimize consumer fetch size and max poll records
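As an illustration of the first two adjustments, retention and partition count can be changed per topic with the tools in /opt/kafka/bin. The retention.ms setting is expressed in milliseconds, so the sketch below includes a small conversion helper; days_to_ms is a hypothetical name, and the topic "test" and the values shown are examples only.

```shell
# Convert whole days to milliseconds, the unit used by the retention.ms topic setting.
days_to_ms() {
  echo $(( $1 * 24 * 60 * 60 * 1000 ))
}

# Example use: keep messages in the "test" topic for 3 days.
#   /opt/kafka/bin/kafka-configs.sh --bootstrap-server localhost:9092 --alter \
#     --entity-type topics --entity-name test --add-config retention.ms="$(days_to_ms 3)"
#
# Example use: raise the "test" topic to 3 partitions (the count can only be increased).
#   /opt/kafka/bin/kafka-topics.sh --bootstrap-server localhost:9092 --alter --topic test --partitions 3
```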
Monitor Kafka performance using tools like Kafka Manager (now named CMAK) or Confluent Control Center to identify bottlenecks and areas for improvement.
Congratulations! You have successfully installed Apache Kafka on Ubuntu 24.04 LTS. For additional help or more detailed information, check the official Apache Kafka website.