How To Install Apache Kafka on Rocky Linux 10

Apache Kafka has become the backbone of modern data architectures, powering real-time data pipelines and streaming applications for thousands of companies worldwide. This distributed streaming platform excels at handling high-throughput, fault-tolerant publish-subscribe messaging systems that process millions of events per second. Rocky Linux 10, with its enterprise-grade stability and RHEL compatibility, provides an ideal foundation for hosting Kafka clusters in production environments. This comprehensive guide walks you through every step of installing and configuring Apache Kafka on Rocky Linux 10, from initial system preparation to verification testing. By the end of this tutorial, you’ll have a fully operational Kafka installation ready to handle your streaming data workloads.
Prerequisites
Before diving into the installation process, ensure your system meets the necessary requirements for running Apache Kafka smoothly.
System Requirements
Your Rocky Linux 10 server needs adequate resources to support Kafka operations effectively. A minimum of 4GB RAM is required, though 8GB or more is strongly recommended for production deployments. Allocate at least 20GB of free disk space to accommodate Kafka binaries, log files, and message data. You’ll need root or sudo privileges to install packages and configure system services. A stable internet connection is essential for downloading Apache Kafka and its dependencies.
Technical Requirements
Apache Kafka runs on the Java Virtual Machine, requiring Java 11 or higher. For newer Kafka versions (3.0+), Java 17 or later is recommended for optimal performance and security updates. Basic familiarity with Linux command-line operations will help you navigate the installation process smoothly. Secure SSH access to your server enables remote administration and configuration tasks.
Network Requirements
Kafka uses specific network ports for communication. Port 9092 serves as the default broker communication port, while port 9093 handles controller traffic in KRaft mode. If you’re using the traditional ZooKeeper architecture, port 2181 must also be accessible. Understanding firewall configuration basics ensures proper network connectivity for your Kafka cluster.
Modern Kafka deployments can choose between KRaft mode (ZooKeeper-free) and traditional ZooKeeper-based setups. KRaft mode represents the future of Kafka architecture, eliminating external dependencies and simplifying cluster management.
Step 1: Update Your Rocky Linux 10 System
Maintaining an up-to-date system forms the foundation of secure and stable software installations. Begin by refreshing your package repository metadata and upgrading existing packages.
Open your terminal and execute the following command to refresh the repository metadata and upgrade all installed packages:
sudo dnf upgrade --refresh -y
On Rocky Linux, dnf update is simply an alias for dnf upgrade, so this single command both synchronizes your local package database with the remote repositories and brings every installed package up to date.
Install essential utilities that you’ll need throughout the Kafka setup process:
sudo dnf install wget tar curl git unzip -y
These tools facilitate file downloads, archive extraction, and version control operations. If your system update included kernel modifications, reboot your server to apply the changes:
sudo reboot
After rebooting, reconnect to your server via SSH. Your Rocky Linux 10 system is now current and ready for Kafka installation.
Step 2: Install Java Development Kit (JDK)
Why Java is Required
Apache Kafka’s entire architecture is built on the Java Virtual Machine, making Java an absolute prerequisite. Kafka requires a minimum of Java 11, though Java 17 or later delivers improved performance and security features for Kafka 3.0 and later.
Installing OpenJDK 11
Rocky Linux 10 provides OpenJDK packages through its default repositories. Install Java 11 with the following command:
sudo dnf install java-11-openjdk java-11-openjdk-devel -y
The java-11-openjdk package contains the Java Runtime Environment, while java-11-openjdk-devel adds development tools such as the javac compiler. The devel package is not strictly required to run Kafka, but it supports the verification step below.
Alternative: Installing Java 17
For enhanced performance and access to the latest JVM optimizations, consider installing Java 17:
sudo dnf install java-17-openjdk java-17-openjdk-devel -y
Java 17 offers better garbage collection algorithms and reduced memory footprint, particularly beneficial for high-throughput Kafka deployments.
Verifying Java Installation
Confirm that Java installed correctly by checking its version:
java -version
You should see output displaying the OpenJDK version number, confirming successful installation. Check the Java compiler version as well:
javac -version
Setting JAVA_HOME
Some Kafka scripts require the JAVA_HOME environment variable. Set it system-wide by editing the environment file:
echo "JAVA_HOME=$(dirname $(dirname $(readlink -f $(which java))))" | sudo tee -a /etc/environment
New login sessions pick this variable up automatically through PAM. To make it available in your current shell, export it directly:
export JAVA_HOME=$(dirname $(dirname $(readlink -f $(which java))))
Verify the JAVA_HOME setting:
echo $JAVA_HOME
Step 3: Create Dedicated Kafka User
Security Best Practices
Running services as dedicated system users follows the principle of least privilege, a fundamental security concept. This approach limits potential damage if a service becomes compromised. Kafka should never run as the root user in production environments.
Creating the Kafka System User
Create a dedicated system user for running Kafka services:
sudo useradd -r -d /opt/kafka -s /usr/sbin/nologin kafka
Let’s break down this command. The -r flag creates a system account, assigning a UID from the reserved system range with no password aging. The -d /opt/kafka parameter sets the home directory. The -s /usr/sbin/nologin option prevents interactive shell login, enhancing security.
This configuration ensures Kafka runs in a controlled environment with restricted privileges. The kafka user can execute Kafka services but cannot log in directly to the system.
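You can confirm the account looks right by querying the passwd database (getent is available on any Rocky Linux system):
getent passwd kafka
The output should show /opt/kafka as the home directory and /usr/sbin/nologin as the shell.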
Step 4: Download Apache Kafka
Navigating to Installation Directory
Change to the /opt directory, the standard location for third-party software on Linux systems:
cd /opt
This directory provides a clean separation between system-managed software and manually installed applications.
Downloading Latest Kafka Release
Download the Apache Kafka binary distribution from the official download site. Check the Apache Kafka downloads page for the current version first, since older releases move to archive.apache.org, and adjust the version in the URL if necessary:
sudo wget https://downloads.apache.org/kafka/3.8.0/kafka_2.13-3.8.0.tgz
The filename follows the convention kafka_[scala-version]-[kafka-version].tgz. The “2.13” refers to the Scala version used to build Kafka, while “3.8.0” is the Kafka version.
Alternatively, use curl if you prefer:
sudo curl -O https://downloads.apache.org/kafka/3.8.0/kafka_2.13-3.8.0.tgz
Verifying Download
Check that the file downloaded completely by examining its size:
ls -lh kafka_2.13-3.8.0.tgz
For production environments, verify the download’s integrity using checksums available on the Apache Kafka website.
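As a minimal sketch, assuming the matching .sha512 file is published alongside the archive:
sudo wget https://downloads.apache.org/kafka/3.8.0/kafka_2.13-3.8.0.tgz.sha512
sha512sum kafka_2.13-3.8.0.tgz
Compare the hash printed by sha512sum against the contents of the downloaded .sha512 file; the two must match exactly.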
Understanding Binary vs Source
Binary distributions come pre-compiled and ready to run, making them ideal for most installations. Source distributions require compilation and are typically used only for custom builds or when contributing to Kafka development.
Step 5: Extract and Configure Kafka Installation
Extracting the Archive
Extract the downloaded Kafka archive using tar:
sudo tar -xzf kafka_2.13-3.8.0.tgz
The tar command uses three flags: -x extracts files, -z handles gzip compression, and -f specifies the filename.
Renaming and Moving Directory
Rename the extracted directory to a simpler path:
sudo mv kafka_2.13-3.8.0 kafka
This creates a version-agnostic path at /opt/kafka, simplifying configuration and future upgrades.
Setting Proper Ownership
Transfer ownership of the entire Kafka directory to the kafka user:
sudo chown -R kafka:kafka /opt/kafka
The -R flag applies ownership recursively to all subdirectories and files. Verify the ownership change:
ls -la /opt/ | grep kafka
Creating Log Directory
Establish a dedicated directory for Kafka data logs:
sudo -u kafka mkdir -p /opt/kafka/logs
This directory stores message data, partition logs, and metadata. Proper log directory configuration is crucial for Kafka’s performance and data persistence.
Step 6: Configure Kafka Server Properties
Understanding KRaft vs ZooKeeper Mode
Apache Kafka traditionally relied on ZooKeeper for cluster coordination and metadata management. KRaft (Kafka Raft) mode, production-ready since Kafka 3.3, eliminates this dependency, simplifying the architecture and improving performance. For new installations on Rocky Linux 10, KRaft mode is recommended as it represents Kafka’s future direction: ZooKeeper support is deprecated and removed entirely in Kafka 4.0.
Editing server.properties for KRaft Mode
Note that Kafka 3.x ships a ZooKeeper-oriented server.properties plus a KRaft sample at config/kraft/server.properties. This guide edits the main file directly; if you do the same, make sure the zookeeper.connect line is removed or commented out. Open the server configuration file in your preferred text editor:
sudo nano /opt/kafka/config/server.properties
Configure the following essential parameters for KRaft mode. Set the process roles to enable both broker and controller functions:
process.roles=broker,controller
Assign a unique node identifier:
node.id=1
Configure the controller quorum voters:
controller.quorum.voters=1@localhost:9093
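The value follows the pattern nodeId@host:port. For a hypothetical three-node quorum it would look like this (kafka1 through kafka3 are placeholder hostnames):
controller.quorum.voters=1@kafka1:9093,2@kafka2:9093,3@kafka3:9093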
Set up network listeners for broker and controller communication:
listeners=PLAINTEXT://0.0.0.0:9092,CONTROLLER://0.0.0.0:9093
advertised.listeners=PLAINTEXT://localhost:9092
Define the listener security protocol mapping:
listener.security.protocol.map=CONTROLLER:PLAINTEXT,PLAINTEXT:PLAINTEXT
Specify the inter-broker listener name, and tell Kafka which listener the controller uses (KRaft mode fails to start without this mapping):
inter.broker.listener.name=PLAINTEXT
controller.listener.names=CONTROLLER
Set the log directory path:
log.dirs=/opt/kafka/logs
Alternative: ZooKeeper Mode Configuration
If you prefer traditional ZooKeeper-based Kafka, configure these parameters instead:
broker.id=1
listeners=PLAINTEXT://0.0.0.0:9092
advertised.listeners=PLAINTEXT://localhost:9092
log.dirs=/opt/kafka/logs
zookeeper.connect=localhost:2181
Advanced Configuration Options
Optimize performance by adjusting network and I/O threads:
num.network.threads=3
num.io.threads=8
Configure data retention policies based on your requirements:
log.retention.hours=168
log.segment.bytes=1073741824
The retention period determines how long Kafka retains messages before deletion; 168 hours equals 7 days. The segment size (1073741824 bytes = 1 GiB) affects disk I/O patterns and cleanup efficiency, since retention is enforced by deleting whole closed segments.
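Retention can also be capped by size instead of, or in addition to, time; the figure below is purely illustrative:
log.retention.bytes=10737418240
When both limits are set, whichever threshold is reached first triggers deletion.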
Formatting Log Directory for KRaft
KRaft mode requires one-time log directory formatting before first startup. First, generate a unique cluster ID:
KAFKA_CLUSTER_ID="$(sudo -u kafka /opt/kafka/bin/kafka-storage.sh random-uuid)"
Format the storage using the generated cluster ID:
sudo -u kafka /opt/kafka/bin/kafka-storage.sh format -t $KAFKA_CLUSTER_ID -c /opt/kafka/config/server.properties
This operation initializes the metadata log and prepares the cluster for operation. Never format a running cluster, as this destroys all existing data.
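You can verify the result with the storage tool’s info command, which reads the same server.properties:
sudo -u kafka /opt/kafka/bin/kafka-storage.sh info -c /opt/kafka/config/server.properties
The output should list /opt/kafka/logs as a formatted directory carrying the cluster ID you just generated.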
Step 7: Create Systemd Service Files
Why Systemd Services are Important
Systemd integration provides professional service management capabilities. Services configured through systemd start automatically during system boot. Systemd monitors service health and can automatically restart failed processes. Centralized logging through journald simplifies troubleshooting and monitoring.
Creating ZooKeeper Service (if using ZooKeeper mode)
For ZooKeeper-based deployments, create a ZooKeeper service file:
sudo nano /etc/systemd/system/zookeeper.service
Add the following configuration:
[Unit]
Description=Apache Zookeeper Server
Documentation=http://zookeeper.apache.org
Requires=network.target
After=network.target
[Service]
Type=simple
User=kafka
Group=kafka
ExecStart=/opt/kafka/bin/zookeeper-server-start.sh /opt/kafka/config/zookeeper.properties
ExecStop=/opt/kafka/bin/zookeeper-server-stop.sh
Restart=on-abnormal
[Install]
WantedBy=multi-user.target
This service file ensures ZooKeeper starts before Kafka and restarts automatically on failure.
Creating Kafka Service
Create the Kafka systemd service file:
sudo nano /etc/systemd/system/kafka.service
For KRaft mode, use this configuration:
[Unit]
Description=Apache Kafka Server
Documentation=http://kafka.apache.org/documentation.html
Requires=network.target
After=network.target
[Service]
Type=simple
User=kafka
Group=kafka
Environment="JAVA_HOME=/usr/lib/jvm/jre-11-openjdk"
ExecStart=/opt/kafka/bin/kafka-server-start.sh /opt/kafka/config/server.properties
ExecStop=/opt/kafka/bin/kafka-server-stop.sh
Restart=on-abnormal
[Install]
WantedBy=multi-user.target
Adjust the JAVA_HOME path in the Environment line to match your installed Java version (for example, /usr/lib/jvm/jre-17-openjdk for Java 17), or remove the line to let Kafka use the system default java. For ZooKeeper mode, add these directives under the [Unit] section:
Requires=zookeeper.service
After=zookeeper.service
Reloading Systemd Daemon
Inform systemd about the new service files:
sudo systemctl daemon-reload
This command parses the new service definitions and makes them available for management. Verify that systemd recognizes the services:
sudo systemctl list-unit-files | grep -E 'kafka|zookeeper'
Step 8: Start and Enable Kafka Services
Starting Services
Launch the Kafka service:
For KRaft mode:
sudo systemctl start kafka
For ZooKeeper mode, start ZooKeeper first, then Kafka:
sudo systemctl start zookeeper
sudo systemctl start kafka
The startup process typically takes 10-30 seconds depending on your system resources.
Enabling Services for Auto-Start
Configure services to start automatically during system boot:
sudo systemctl enable kafka
If using ZooKeeper:
sudo systemctl enable zookeeper
sudo systemctl enable kafka
Enabling services ensures your Kafka cluster remains available after system reboots or power cycles.
Checking Service Status
Verify that Kafka is running properly:
sudo systemctl status kafka
Look for “active (running)” in the output. The status display shows the service state, process ID, and recent log entries. If you see “failed” or “inactive,” investigate the logs for error messages.
For ZooKeeper mode, also check:
sudo systemctl status zookeeper
Viewing Service Logs
Monitor Kafka service logs in real-time:
sudo journalctl -u kafka -f
Press Ctrl+C to stop following logs. View the last 100 lines of Kafka logs:
sudo journalctl -u kafka -n 100
Successful startup logs typically include messages about socket server initialization and completion of startup sequence.
Step 9: Configure Firewall for Kafka
Understanding Required Ports
Kafka requires specific network ports for client and inter-broker communication. Port 9092 handles all Kafka client connections and data transfer. Port 9093 serves controller traffic in KRaft mode. ZooKeeper-based setups need port 2181 open for client connections.
Checking Firewalld Status
Verify that firewalld is active on your system:
sudo systemctl status firewalld
If firewalld is inactive, start and enable it:
sudo systemctl start firewalld
sudo systemctl enable firewalld
Adding Firewall Rules
Open the necessary ports for Kafka operation:
sudo firewall-cmd --permanent --add-port=9092/tcp
For KRaft mode, also open the controller port:
sudo firewall-cmd --permanent --add-port=9093/tcp
If using ZooKeeper:
sudo firewall-cmd --permanent --add-port=2181/tcp
The --permanent flag ensures rules persist across firewall restarts.
Reloading Firewall Configuration
Apply the new firewall rules:
sudo firewall-cmd --reload
Verify that ports are now open:
sudo firewall-cmd --list-ports
You should see port 9092, along with 9093 or 2181 where applicable, listed in the output.
Security Considerations
For production environments, restrict access to specific IP addresses or networks. Create rich rules for granular control:
sudo firewall-cmd --permanent --add-rich-rule='rule family="ipv4" source address="192.168.1.0/24" port protocol="tcp" port="9092" accept'
This command allows only clients from the 192.168.1.0/24 subnet to connect to Kafka. Always follow the principle of least privilege when configuring network access.
Step 10: Verify Kafka Installation
Testing Kafka Broker Connectivity
Confirm that Kafka is listening and responding to requests:
sudo /opt/kafka/bin/kafka-broker-api-versions.sh --bootstrap-server localhost:9092
This command queries the broker for supported API versions. Successful output displays a detailed list of API endpoints, confirming that Kafka is operational.
Creating a Test Topic
Topics organize messages within Kafka. Create a test topic to verify cluster functionality:
sudo -u kafka /opt/kafka/bin/kafka-topics.sh --create --topic test-topic --bootstrap-server localhost:9092 --partitions 1 --replication-factor 1
Partitions enable parallel processing, while replication factor determines data redundancy. For single-node deployments, use a replication factor of 1.
Listing Topics
Display all topics in your Kafka cluster:
sudo -u kafka /opt/kafka/bin/kafka-topics.sh --list --bootstrap-server localhost:9092
The output should include “test-topic” among any other topics present.
Describing Topic Details
Examine the configuration and status of your test topic:
sudo -u kafka /opt/kafka/bin/kafka-topics.sh --describe --topic test-topic --bootstrap-server localhost:9092
The output displays partition information, leader assignments, replicas, and in-sync replicas (ISR). Understanding this output helps you monitor cluster health and data distribution.
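The output resembles the following (the TopicId and broker IDs depend on your cluster; the values shown here are illustrative for a single node with node.id=1):
Topic: test-topic  TopicId: <uuid>  PartitionCount: 1  ReplicationFactor: 1  Configs: segment.bytes=1073741824
  Topic: test-topic  Partition: 0  Leader: 1  Replicas: 1  Isr: 1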
Checking Kafka Logs
Navigate to the Kafka log directory:
ls -lh /opt/kafka/logs/
Review the main server log file for startup messages:
tail -50 /opt/kafka/logs/server.log
Successful initialization includes messages about socket server starting, log loading completion, and the broker being in RUNNING state.
Step 11: Test Kafka with Producer and Consumer
Understanding Producer-Consumer Model
Kafka’s architecture separates message producers from consumers through topics. Producers write messages to topics without knowing who will consume them. Consumers read messages from topics independently, enabling scalable, decoupled architectures.
Starting Kafka Console Producer
Open a terminal session and launch the console producer:
sudo -u kafka /opt/kafka/bin/kafka-console-producer.sh --topic test-topic --bootstrap-server localhost:9092
The prompt changes to >, indicating the producer is ready to accept input. Type messages and press Enter after each:
>Hello Kafka on Rocky Linux 10
>This is a test message
>Streaming data works perfectly
Each line becomes a separate message in the topic. Leave this terminal open.
Starting Kafka Console Consumer
Open a second terminal session to your server. Start the console consumer:
sudo -u kafka /opt/kafka/bin/kafka-console-consumer.sh --topic test-topic --from-beginning --bootstrap-server localhost:9092
The --from-beginning flag tells the consumer to read all existing messages from the topic’s start. Messages you typed in the producer terminal appear in the consumer output.
Verifying Message Flow
With both terminals visible, type new messages in the producer terminal. These messages appear almost instantly in the consumer terminal, demonstrating real-time streaming capabilities. Message ordering within a partition is guaranteed, ensuring predictable data flow.
Testing with Key-Value Messages
Stop the current producer with Ctrl+C. Restart it with key parsing enabled:
sudo -u kafka /opt/kafka/bin/kafka-console-producer.sh --topic test-topic --bootstrap-server localhost:9092 --property parse.key=true --property key.separator=:
Send messages with keys:
user1:Login event
user2:Purchase completed
user1:Logout event
Keys affect partition assignment and enable consumer groups to process related messages together.
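To see those keys on the consuming side, restart the console consumer with key printing enabled (print.key and key.separator are standard console-consumer properties):
sudo -u kafka /opt/kafka/bin/kafka-console-consumer.sh --topic test-topic --from-beginning --bootstrap-server localhost:9092 --property print.key=true --property key.separator=: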
Stopping Producer and Consumer
Gracefully stop both processes using Ctrl+C in each terminal. This closes connections and releases resources properly.
Step 12: Basic Kafka Management Operations
Managing Topics
Modify existing topic configurations using the alter command:
sudo -u kafka /opt/kafka/bin/kafka-topics.sh --alter --topic test-topic --partitions 3 --bootstrap-server localhost:9092
Note that you can only increase partition count, never decrease it. Reducing partitions risks data loss and breaks consumer offsets.
Change retention policies for specific topics:
sudo -u kafka /opt/kafka/bin/kafka-configs.sh --alter --entity-type topics --entity-name test-topic --add-config retention.ms=604800000 --bootstrap-server localhost:9092
This sets retention to 7 days (604800000 milliseconds).
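To confirm the override took effect, describe the topic’s dynamic configuration:
sudo -u kafka /opt/kafka/bin/kafka-configs.sh --describe --entity-type topics --entity-name test-topic --bootstrap-server localhost:9092
The output should list retention.ms=604800000 among the topic’s overrides.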
Monitoring Kafka Performance
Consumer groups track reading progress across topic partitions. Console consumers receive auto-generated group names such as console-consumer-12345, so list the active groups first:
sudo -u kafka /opt/kafka/bin/kafka-consumer-groups.sh --list --bootstrap-server localhost:9092
Then describe a group using one of the returned names:
sudo -u kafka /opt/kafka/bin/kafka-consumer-groups.sh --describe --group <group-name> --bootstrap-server localhost:9092
The output shows current offset, log end offset, and lag for each partition. High lag indicates consumers cannot keep pace with producers, suggesting capacity issues.
Monitor topic message offsets; the per-partition end offsets indicate how many messages each partition holds. In Kafka 3.x, the kafka-get-offsets.sh script replaces the older kafka.tools.GetOffsetShell invocation:
sudo -u kafka /opt/kafka/bin/kafka-get-offsets.sh --bootstrap-server localhost:9092 --topic test-topic
Deleting Topics
Remove topics that are no longer needed:
sudo -u kafka /opt/kafka/bin/kafka-topics.sh --delete --topic test-topic --bootstrap-server localhost:9092
Topic deletion is enabled by default (delete.topic.enable=true); confirm it has not been disabled in server.properties. Deleted topics cannot be recovered unless you have external backups.
Kafka Connect Introduction
Kafka Connect provides a framework for streaming data between Kafka and external systems. Configuration files reside in /opt/kafka/config/. Common use cases include database change data capture, log aggregation, and cloud storage integration.
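As a quick sketch, Kafka ships sample configurations for a standalone file-source connector. Note that recent Kafka releases keep the file connector off the default classpath, so you may first need to set plugin.path in connect-standalone.properties to include libs/connect-file-<version>.jar:
sudo -u kafka /opt/kafka/bin/connect-standalone.sh /opt/kafka/config/connect-standalone.properties /opt/kafka/config/connect-file-source.properties
With the defaults in connect-file-source.properties, this tails a file named test.txt into a topic called connect-test.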
Performance Tuning Tips
Adjust the JVM heap size for better memory management. For an interactive session, set the KAFKA_HEAP_OPTS environment variable before launching Kafka; to apply it to the systemd service, add the same value as an Environment= line (for example via sudo systemctl edit kafka), since the service does not inherit shell exports:
export KAFKA_HEAP_OPTS="-Xmx4G -Xms4G"
Optimize batch sizes in producer configurations for higher throughput. Tune network buffer sizes to match your network capacity and message patterns.
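As an illustrative starting point (the values here are assumptions to tune against your own workload, not benchmarks), a producer configuration might raise batching like this:
batch.size=65536
linger.ms=5
compression.type=lz4
These are standard Kafka producer settings; pass them in the client’s properties file or via --producer.config with the console producer.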
Common Troubleshooting Issues
Service Fails to Start
If Kafka won’t start, first verify Java installation:
java -version
echo $JAVA_HOME
Check file permissions on the Kafka directory:
ls -la /opt/kafka
All files should be owned by the kafka user and group. Review systemd service logs for detailed error messages:
sudo journalctl -xe -u kafka
Ensure ports 9092 and 9093 aren’t already in use (ss ships with Rocky Linux; netstat may not be installed):
sudo ss -tuln | grep -E '9092|9093'
Connection Refused Errors
Verify that Kafka is listening on the correct interface:
sudo ss -tuln | grep 9092
You should see Kafka listening on 0.0.0.0:9092 or your specific IP address. Check firewall rules:
sudo firewall-cmd --list-all
Verify the listeners configuration in server.properties matches your network setup. The advertised.listeners property must be accessible from client machines.
ZooKeeper Connection Issues
For ZooKeeper-based deployments, ensure ZooKeeper starts before Kafka:
sudo systemctl status zookeeper
Verify the zookeeper.connect property in server.properties matches your ZooKeeper configuration. Test ZooKeeper connectivity (install telnet first if it is missing):
telnet localhost 2181
Type ruok (are you ok) and press Enter. A healthy ZooKeeper responds with imok. Note that ZooKeeper 3.5 and later only answer four-letter commands listed in 4lw.commands.whitelist, so you may need to add 4lw.commands.whitelist=ruok to zookeeper.properties.
Disk Space Problems
Monitor disk usage regularly:
df -h /opt/kafka/logs
Adjust log retention settings to match available storage:
log.retention.hours=24
log.retention.bytes=1073741824
Implement automatic cleanup policies by ensuring log.cleanup.policy=delete in server.properties.
Memory Issues
Monitor Java heap usage and system memory (pgrep -d',' joins multiple matching PIDs into the comma-separated list that top expects):
free -m
top -p $(pgrep -d',' -f kafka.Kafka)
If Kafka consumes excessive memory, reduce heap size in KAFKA_HEAP_OPTS or kafka-server-start.sh. Consider adding more RAM for production workloads.
Security Best Practices
Authentication Configuration
Implement SASL (Simple Authentication and Security Layer) for client authentication. Configure SSL/TLS to encrypt data in transit between clients and brokers. Set up ACLs (Access Control Lists) to restrict topic access:
sudo -u kafka /opt/kafka/bin/kafka-acls.sh --add --allow-principal User:client1 --operation Read --topic test-topic --bootstrap-server localhost:9092
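ACL commands only take effect once an authorizer is configured in server.properties; for KRaft clusters the built-in choice is the StandardAuthorizer shown below (ZooKeeper-mode clusters use kafka.security.authorizer.AclAuthorizer instead):
authorizer.class.name=org.apache.kafka.metadata.authorizer.StandardAuthorizer
Without an authorizer, the kafka-acls.sh command above has nothing to enforce.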
Network Security
Limit firewall access to trusted IP addresses:
sudo firewall-cmd --permanent --add-rich-rule='rule family="ipv4" source address="10.0.0.0/8" accept'
Use private networks for inter-broker communication in multi-node clusters. Implement network segmentation to isolate Kafka traffic from public networks.
User and Permission Management
Running Kafka as a dedicated non-root user (already implemented) is essential. Set restrictive file permissions:
sudo chmod 750 /opt/kafka
sudo chmod 640 /opt/kafka/config/*.properties
Conduct regular security audits to identify potential vulnerabilities.
Data Security
Consider encryption at rest for sensitive data. Enable SSL for inter-broker communication in multi-broker clusters:
security.inter.broker.protocol=SSL
Note that security.inter.broker.protocol and inter.broker.listener.name are mutually exclusive, so if your KRaft configuration already sets inter.broker.listener.name, point it at an SSL listener instead.
Store sensitive configuration parameters separately from application code.
Regular Updates
Keep your Kafka installation current with security patches. Monitor Apache Kafka security advisories and the Rocky Linux security list. Plan upgrade strategies that minimize downtime and data loss risk.
Congratulations! You have successfully installed Apache Kafka. Thanks for using this tutorial to install the Apache Kafka distributed streaming platform on your Rocky Linux 10 system. For additional help or useful information, we recommend you check the official Apache Kafka website.