Java – The Hadoop API throws an error when trying to initialize the cluster to use DistCp

The Hadoop API throws an error when trying to initialize the cluster to use DistCp… here is a solution to the problem.

I’m trying to do a distributed copy with the DistCp classes via the Hadoop API, but I get an error when the job tries to connect to the cluster. I’ve tried changing the configuration files for Hadoop and HDFS, but that doesn’t seem to help. I’m testing the app on the latest Cloudera Quickstart.

I run this command to execute the class.
java -cp myjar com.keedio.hadoop.Mover

package com.keedio.hadoop;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.tools.DistCp;
import org.apache.hadoop.tools.DistCpOptions;
import org.apache.hadoop.util.ToolRunner;

import java.util.Collections;

public class Mover {

    public static void main(String[] args) {
        try {
            // Local FileSystem handles (not actually used by the DistCp call below).
            FileSystem fs = FileSystem.getLocal(new Configuration());
            FileSystem fs2 = FileSystem.get(java.net.URI.create("file:///"),
                    new Configuration());

            // Use the List<Path> constructor so ficheros1 is treated as a source
            // path; the (Path, Path) constructor would treat it as a source-file listing.
            DistCpOptions distCpOptions = new DistCpOptions(
                    Collections.singletonList(new Path("file:///Users/jvasquez/Desktop/ficheros1")),
                    new Path("file:///Users/jvasquez/Desktop/ficheros2"));
            String[] argumentos = {"file:///Users/jvasquez/Desktop/ficheros1",
                    "file:///Users/jvasquez/Desktop/ficheros2"};

            // Load the cluster configuration explicitly from the Cloudera config files.
            Configuration conf = new Configuration();
            conf.addResource(new Path("/etc/hadoop/conf/yarn-site.xml"));
            conf.addResource(new Path("/etc/hadoop/conf/core-site.xml"));
            conf.addResource(new Path("/etc/hadoop/conf/mapred-site.xml"));
            conf.addResource(new Path("/etc/hadoop/conf/hdfs-site.xml"));

            DistCp distCp = new DistCp(conf, distCpOptions);
            ToolRunner.run(distCp, argumentos);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

This is the error:

log4j:WARN No appenders could be found for logger (org.apache.hadoop.metrics2.lib.MutableMetricsFactory).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
java.io.IOException: Cannot initialize Cluster. Please check your configuration for mapreduce.framework.name and the correspond server addresses.
at org.apache.hadoop.mapreduce.Cluster.initialize(Cluster.java:143)
at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:108)
at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:101)
at org.apache.hadoop.tools.DistCp.createMetaFolderPath(DistCp.java:471)
at org.apache.hadoop.tools.DistCp.<init>(DistCp.java:107)
at com.keedio.hadoop.Mover.main(Mover.java:48)
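This error comes from org.apache.hadoop.mapreduce.Cluster: the DistCp constructor builds a Cluster, which asks each available ClientProtocolProvider to handle the configured mapreduce.framework.name and throws exactly this IOException when none of them can. When the class is launched with a bare java -cp myjar, the Hadoop configuration directory and the YARN client jars are not on the classpath, so the yarn provider is never found. A minimal diagnostic sketch (the class name CheckFramework is just illustrative; it assumes the same /etc/hadoop/conf paths as the question):

package com.keedio.hadoop;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;

public class CheckFramework {

    public static void main(String[] args) {
        Configuration conf = new Configuration();
        conf.addResource(new Path("/etc/hadoop/conf/core-site.xml"));
        conf.addResource(new Path("/etc/hadoop/conf/mapred-site.xml"));

        // On a YARN cluster this should print "yarn". If it does and the
        // "Cannot initialize Cluster" error still appears, the JVM running
        // DistCp is missing the YARN ClientProtocolProvider on its classpath.
        System.out.println("mapreduce.framework.name = "
                + conf.get("mapreduce.framework.name"));
    }
}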

Solution

Finally I solved it: it turned out I just needed to use the hadoop jar command to launch the app as a MapReduce application.
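Concretely, instead of the java -cp invocation from the question, something along these lines should work (myjar mirrors the jar name used above):

hadoop jar myjar com.keedio.hadoop.Mover

The hadoop jar launcher puts the full Hadoop classpath, including /etc/hadoop/conf and the YARN client jars, on the application’s classpath, which is exactly what Cluster.initialize needs to find a matching ClientProtocolProvider.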
