Java – How do I compile Hadoop for a 64-bit Linux machine?


How do I compile Hadoop for a 64-bit Linux machine?

I have downloaded the latest stable binaries for Hadoop (2.2.0). Just as I was initializing HDFS, I got this warning:

WARN util.NativeCodeLoader: Unable to load native-hadoop library for
your platform… using builtin-java classes where applicable

I knew I could fix this by compiling from source, so I downloaded the source package for Hadoop. I know the basic process of compiling, but I got confused after reading the README. A quick Google search showed that I have to use Maven for this, a tool for building Java-based projects.

So my question is: how do I compile Hadoop from source using Maven? Should I go into each directory and compile each module separately? A step-by-step guide would be very helpful.

Solution

After extracting the source code, you will find the super POM at the following location:
\hadoop-2.2.0-src.tar\hadoop-2.2.0-src\hadoop-2.2.0-src\pom.xml
Building from this POM builds all modules. You can build with the command: mvn clean install
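If the goal is specifically the 64-bit native libraries (to silence the NativeCodeLoader warning), a plain `mvn clean install` is not enough: the `native` profile has to be activated. Below is a minimal sketch based on the profile names documented in Hadoop's BUILDING.txt; it assumes Maven, a JDK, protobuf 2.5, and CMake are already installed, and the guard simply makes it safe to paste on a machine where they are not:

```shell
# Sketch, not a definitive recipe: profile names are from Hadoop's BUILDING.txt.
# Run from the directory containing the extracted source tree.
if command -v mvn >/dev/null 2>&1 && [ -d hadoop-2.2.0-src ]; then
  cd hadoop-2.2.0-src
  # -Pdist,native  build a full distribution including the native (JNI) libraries
  # -DskipTests    skip the lengthy unit-test run
  # -Dtar          also produce a binary tarball
  mvn package -Pdist,native -DskipTests -Dtar
else
  echo "Maven or the extracted source tree is missing; install/extract first"
fi
```

The `dist,native` profiles are what trigger the CMake build of libhadoop.so for your own architecture, which is the whole point of recompiling on a 64-bit machine.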

You should see the following in the build log.

            [INFO] ------------------------------------------------------------------------
            [INFO] Reactor Build Order:
            [INFO]
            [INFO] Apache Hadoop Main
            [INFO] Apache Hadoop Project POM
            [INFO] Apache Hadoop Annotations
            [INFO] Apache Hadoop Project Dist POM
            [INFO] Apache Hadoop Assemblies
            [INFO] Apache Hadoop Maven Plugins
            [INFO] Apache Hadoop Auth
            [INFO] Apache Hadoop Auth Examples
            [INFO] Apache Hadoop Common
            [INFO] Apache Hadoop NFS
            [INFO] Apache Hadoop Common Project
            [INFO] Apache Hadoop HDFS
            [INFO] Apache Hadoop HttpFS
            [INFO] Apache Hadoop HDFS BookKeeper Journal
            [INFO] Apache Hadoop HDFS-NFS
            [INFO] Apache Hadoop HDFS Project
            [INFO] hadoop-yarn
            [INFO] hadoop-yarn-api
            [INFO] hadoop-yarn-common
            [INFO] hadoop-yarn-server
            [INFO] hadoop-yarn-server-common
            [INFO] hadoop-yarn-server-nodemanager
            [INFO] hadoop-yarn-server-web-proxy
            [INFO] hadoop-yarn-server-resourcemanager
            [INFO] hadoop-yarn-server-tests
            [INFO] hadoop-yarn-client
            [INFO] hadoop-yarn-applications
            [INFO] hadoop-yarn-applications-distributedshell
            [INFO] hadoop-mapreduce-client
            [INFO] hadoop-mapreduce-client-core
            [INFO] hadoop-yarn-applications-unmanaged-am-launcher
            [INFO] hadoop-yarn-site
            [INFO] hadoop-yarn-project
            [INFO] hadoop-mapreduce-client-common
            [INFO] hadoop-mapreduce-client-shuffle
            [INFO] hadoop-mapreduce-client-app
            [INFO] hadoop-mapreduce-client-hs
            [INFO] hadoop-mapreduce-client-jobclient
            [INFO] hadoop-mapreduce-client-hs-plugins
            [INFO] Apache Hadoop MapReduce Examples
            [INFO] hadoop-mapreduce

And there’s more….
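For reference, after a successful `dist,native` build the assembled distribution ends up under `hadoop-dist/target/` inside the source tree. The sketch below shows one way to drop the rebuilt 64-bit libraries over the bundled ones; the exact directory name and the `HADOOP_HOME` layout are assumptions based on version 2.2.0, so adjust the paths to your setup:

```shell
# Sketch with assumed paths: a dist,native build of 2.2.0, and HADOOP_HOME
# pointing at the installation whose native libs should be replaced.
DIST="hadoop-dist/target/hadoop-2.2.0"
if [ -d "$DIST/lib/native" ] && [ -n "$HADOOP_HOME" ]; then
  # Overwrite the bundled 32-bit native libs with the freshly built 64-bit ones.
  cp -v "$DIST"/lib/native/* "$HADOOP_HOME/lib/native/"
else
  echo "Build output or HADOOP_HOME not found; nothing copied"
fi
```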

However, if you just want to use Hadoop rather than develop it, building every module from source is a long process.
You should be able to use an existing pre-built distribution instead.
There could be some configuration issues.
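One quick way to see whether the rebuilt libraries are actually being picked up (and to diagnose configuration issues like the one above) is `hadoop checknative`, which reports the load status of each native component. The guard below just makes the snippet safe to paste on a machine that does not yet have Hadoop on the PATH:

```shell
# `hadoop checknative -a` lists each native component (hadoop, zlib, snappy,
# lz4, bzip2) and whether it loaded successfully.
if command -v hadoop >/dev/null 2>&1; then
  hadoop checknative -a
else
  echo "hadoop not on PATH; run this after installing the rebuilt distribution"
fi
```

If `hadoop: true` shows up in the output, the NativeCodeLoader warning should be gone.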

The other option is Cloudera's CDH distribution. I have installed it on Red Hat Linux.

http://www.cloudera.com/content/cloudera-content/cloudera-docs/CDH4/latest/CDH4-Quick-Start/cdh4qs_topic_3.html

Good luck.
