How To Install Apache Hadoop on CentOS 8

In this tutorial, we will show you how to install Apache Hadoop on CentOS 8. For those who don’t know, Apache Hadoop is an open-source framework for distributed storage and distributed processing of big data on clusters of commodity hardware. Rather than relying on hardware to deliver high availability, the library itself is designed to detect and handle failures at the application layer, so it delivers a highly available service on top of a cluster of computers, each of which may be prone to failure.

This article assumes you have at least basic knowledge of Linux, know how to use the shell, and, most importantly, host your site on your own VPS. The installation is quite simple and assumes you are running as the root account; if not, you may need to prefix the commands with ‘sudo’ to get root privileges. I will walk you through the step-by-step installation of Apache Hadoop on a CentOS 8 server.

Install Apache Hadoop on CentOS 8

Step 1. First, make sure your system is up to date by running dnf update.

Step 2. Installing Java.

Apache Hadoop is written in Java, and the 3.2.x release used in this tutorial supports only Java 8. You can install OpenJDK 8 using the following command:
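
The command listing for this step is not reproduced here, so the following is only a minimal sketch. It assumes the stock CentOS 8 package names java-1.8.0-openjdk and java-1.8.0-openjdk-devel, and it wraps the install in a hypothetical function, install_java, so that nothing runs until you call it:

```shell
# Assumption: these are the OpenJDK 8 runtime and JDK packages in the CentOS 8 repositories.
JAVA_PACKAGES="java-1.8.0-openjdk java-1.8.0-openjdk-devel"

# Hypothetical helper (not part of Hadoop or CentOS): installs the packages.
# Drop "sudo" if you are already root.
install_java() {
  # JAVA_PACKAGES is left unquoted on purpose so it splits into two package names.
  sudo dnf -y install $JAVA_PACKAGES
}

# Run the step with: install_java
```

The -devel package is included because it ships the JDK tools (such as jps) that are handy for checking the Hadoop daemons later.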

Check the Java version with java -version; it should report a 1.8.0 release.

Step 3. Installing Apache Hadoop on CentOS 8.

It is recommended to create a normal (non-root) user to configure Apache Hadoop. Create the user using the following command:
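
A minimal sketch of this step, assuming the account is named hadoop (which matches the /home/hadoop home directory used later in this tutorial). The function name create_hadoop_user is hypothetical and only keeps the commands from running until you call it:

```shell
# Assumption: the dedicated account is called "hadoop".
HADOOP_USER="hadoop"

# Hypothetical helper: creates the account and sets its password.
# Drop "sudo" if you are already root.
create_hadoop_user() {
  sudo useradd -m "$HADOOP_USER"   # -m creates the home directory, /home/hadoop
  sudo passwd "$HADOOP_USER"       # prompts interactively for the new password
}

# Run the step with: create_hadoop_user
# Then switch to the new account with: su - hadoop
```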

Next, we will need to configure passwordless SSH authentication for the local system:
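
One way to set this up, sketched under the assumption that you are logged in as the hadoop user and that an RSA key with an empty passphrase is acceptable for a single-node test machine:

```shell
# Make sure the SSH directory exists with the permissions sshd expects.
mkdir -p "$HOME/.ssh"
chmod 700 "$HOME/.ssh"

# Generate a key pair only if one does not exist yet.
# -N "" sets an empty passphrase; -f names the private key file.
if [ ! -f "$HOME/.ssh/id_rsa" ]; then
  ssh-keygen -t rsa -N "" -f "$HOME/.ssh/id_rsa"
fi

# Authorize the public key for logins to this same machine.
cat "$HOME/.ssh/id_rsa.pub" >> "$HOME/.ssh/authorized_keys"
chmod 600 "$HOME/.ssh/authorized_keys"
```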

Verify the passwordless SSH configuration by running ssh localhost as the hadoop user; it should log you in without asking for a password.

Next, download the latest stable version of Apache Hadoop. At the time of writing, it is version 3.2.1:
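
The download link from the original listing is not available here. As a sketch, the following assumes the release is fetched from the Apache archive with curl and unpacked to a directory named hadoop in the current user's home; the function name download_hadoop is hypothetical:

```shell
HADOOP_VERSION="3.2.1"
HADOOP_TARBALL="hadoop-${HADOOP_VERSION}.tar.gz"
# Assumption: the Apache archive is used as the download source.
HADOOP_URL="https://archive.apache.org/dist/hadoop/common/hadoop-${HADOOP_VERSION}/${HADOOP_TARBALL}"

# Hypothetical helper: downloads, unpacks, and renames the release directory.
download_hadoop() {
  cd "$HOME" || return 1
  curl -fLO "$HADOOP_URL"               # -f fail on HTTP errors, -L follow redirects, -O keep remote file name
  tar -xzf "$HADOOP_TARBALL"            # creates hadoop-3.2.1/
  mv "hadoop-${HADOOP_VERSION}" hadoop  # results in $HOME/hadoop
}

# Run the step as the hadoop user with: download_hadoop
```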

Then, you will need to configure the Hadoop and Java environment variables on your system:
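
A minimal sketch that appends the variables to the hadoop user's ~/.bashrc. The two paths are assumptions: /home/hadoop/hadoop is where Hadoop was unpacked, and /usr/lib/jvm/jre-1.8.0-openjdk is the OpenJDK 8 location (check yours with ls /usr/lib/jvm):

```shell
# The quoted 'EOF' keeps $HADOOP_HOME and $PATH literal in the file,
# so they are expanded at login time rather than now.
cat >> "$HOME/.bashrc" <<'EOF'
export JAVA_HOME=/usr/lib/jvm/jre-1.8.0-openjdk
export HADOOP_HOME=/home/hadoop/hadoop
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_YARN_HOME=$HADOOP_HOME
export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin
EOF
```

Adding the sbin and bin directories to PATH is what makes commands such as hdfs and start-dfs.sh available without typing their full paths.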

Now activate the environment variables in your current shell by sourcing the file you just edited (for example, source ~/.bashrc if you added them there).

Next, open the Hadoop environment file, hadoop-env.sh, and set JAVA_HOME in it:
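
Instead of opening an editor, the setting can also be appended from the shell. This sketch assumes Hadoop lives in $HOME/hadoop when HADOOP_HOME is not set, and uses the same assumed OpenJDK 8 path as elsewhere; adjust both to your system:

```shell
# Location of hadoop-env.sh inside the Hadoop installation.
HADOOP_ENV_FILE="${HADOOP_HOME:-$HOME/hadoop}/etc/hadoop/hadoop-env.sh"

# On a real installation this directory already exists, so mkdir -p changes nothing.
mkdir -p "$(dirname "$HADOOP_ENV_FILE")"

# Assumption: OpenJDK 8 is available under this path.
echo 'export JAVA_HOME=/usr/lib/jvm/jre-1.8.0-openjdk' >> "$HADOOP_ENV_FILE"
```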

Hadoop has many configuration files, which need to be configured according to the requirements of your Hadoop infrastructure. Let’s start with the configuration for a basic single-node Hadoop cluster:

Edit core-site.xml:
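
The original file contents are not shown here. A minimal single-node sketch, assuming the NameNode runs on this machine and listens on port 9000 (both the host name and the port are assumptions you can change):

```shell
# Hadoop configuration directory; falls back to $HOME/hadoop if HADOOP_HOME is not set.
HADOOP_CONF_DIR="${HADOOP_HOME:-$HOME/hadoop}/etc/hadoop"
mkdir -p "$HADOOP_CONF_DIR"   # already present on a real installation

# Overwrites core-site.xml with a single property: the default file system URI.
cat > "$HADOOP_CONF_DIR/core-site.xml" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
EOF
```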

Create the namenode and datanode directories under the hadoop user’s home directory, /home/hadoop:
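
For example, run as the hadoop user. The parent directory name hadoopdata is an assumption; any location works as long as hdfs-site.xml points to the same paths:

```shell
# -p creates the missing parent directories and does not fail if they already exist.
mkdir -p "$HOME/hadoopdata/hdfs/namenode" "$HOME/hadoopdata/hdfs/datanode"
```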

Edit hdfs-site.xml:
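
A minimal sketch for a single-node setup, assuming the storage directories are $HOME/hadoopdata/hdfs/namenode and $HOME/hadoopdata/hdfs/datanode. A replication factor of 1 is used because there is only one DataNode:

```shell
# Hadoop configuration directory; falls back to $HOME/hadoop if HADOOP_HOME is not set.
HADOOP_CONF_DIR="${HADOOP_HOME:-$HOME/hadoop}/etc/hadoop"
mkdir -p "$HADOOP_CONF_DIR"   # already present on a real installation

# The unquoted EOF lets the shell expand $HOME, so the file contains absolute paths.
cat > "$HADOOP_CONF_DIR/hdfs-site.xml" <<EOF
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file://$HOME/hadoopdata/hdfs/namenode</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file://$HOME/hadoopdata/hdfs/datanode</value>
  </property>
</configuration>
EOF
```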

Edit mapred-site.xml:
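
As a sketch, the only setting assumed here is the one that tells MapReduce to run its jobs on YARN:

```shell
# Hadoop configuration directory; falls back to $HOME/hadoop if HADOOP_HOME is not set.
HADOOP_CONF_DIR="${HADOOP_HOME:-$HOME/hadoop}/etc/hadoop"
mkdir -p "$HADOOP_CONF_DIR"   # already present on a real installation

# Overwrites mapred-site.xml with a single property.
cat > "$HADOOP_CONF_DIR/mapred-site.xml" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
EOF
```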

Edit yarn-site.xml:
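
A minimal sketch, assuming the only change needed is enabling the shuffle service that MapReduce jobs rely on when running under YARN:

```shell
# Hadoop configuration directory; falls back to $HOME/hadoop if HADOOP_HOME is not set.
HADOOP_CONF_DIR="${HADOOP_HOME:-$HOME/hadoop}/etc/hadoop"
mkdir -p "$HADOOP_CONF_DIR"   # already present on a real installation

# Overwrites yarn-site.xml with a single property.
cat > "$HADOOP_CONF_DIR/yarn-site.xml" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>
EOF
```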

Now format the NameNode using the following command. Do not forget to check the storage directory reported in the output:
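
A sketch of this step, assuming the Hadoop bin directory is on your PATH and you are logged in as the hadoop user. The function name format_namenode and the log file name are hypothetical; the function keeps a copy of the output so the storage directory line is easy to find:

```shell
# Assumption: the log is kept in the hadoop user's home directory.
FORMAT_LOG="$HOME/namenode-format.log"

# Hypothetical helper: formats the NameNode and shows the line naming the storage directory.
# Formatting erases existing HDFS metadata, so run it only on a fresh installation.
format_namenode() {
  hdfs namenode -format 2>&1 | tee "$FORMAT_LOG"
  # A successful run logs that the storage directory has been successfully formatted.
  grep "successfully formatted" "$FORMAT_LOG"
}

# Run the step with: format_namenode
```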

Start both NameNode and DataNode daemons by using the scripts provided by Hadoop:
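
As a sketch, assuming the Hadoop sbin directory is on your PATH: start-dfs.sh and start-yarn.sh are the startup scripts shipped with Hadoop, and jps comes with the JDK. The wrapper function start_hadoop is hypothetical:

```shell
# Hypothetical helper: starts HDFS and YARN, then lists the running Java processes.
start_hadoop() {
  start-dfs.sh    # starts the NameNode, DataNode, and SecondaryNameNode
  start-yarn.sh   # starts the ResourceManager and NodeManager
  jps             # the daemons above should appear in this list
}

# Run the step as the hadoop user with: start_hadoop
```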

Step 4. Configure Firewall.

Run the following command to allow Apache Hadoop connections through the firewall:
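
A minimal sketch using firewalld, assuming the ports to open are 9870 (NameNode web interface) and 8088 (YARN ResourceManager web interface). The function name open_hadoop_ports is hypothetical:

```shell
# Assumption: these are the web interface ports you want reachable from outside.
HADOOP_UI_PORTS="9870 8088"

# Hypothetical helper: opens each port permanently, then reloads the firewall rules.
# Drop "sudo" if you are already root.
open_hadoop_ports() {
  for port in $HADOOP_UI_PORTS; do
    sudo firewall-cmd --permanent --add-port="${port}/tcp"
  done
  sudo firewall-cmd --reload
}

# Run the step with: open_hadoop_ports
```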

Step 5. Accessing Apache Hadoop.

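In Hadoop 3.x, the NameNode web interface is available on HTTP port 9870 by default (port 50070 was the default in Hadoop 2.x), and the YARN ResourceManager web interface is available on port 8088. Open your favorite browser and navigate to http://your-domain.com:9870 or http://your-server-ip:9870.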

Congratulations! You have successfully installed Apache Hadoop. Thanks for using this tutorial for installing Hadoop on a CentOS 8 system. For additional help or useful information, we recommend you check the official Apache Hadoop website.
