Java - Job jar file is not set. User classes may not be found in Hadoop

I’m trying to run the MapReduce WordCount job, but Hadoop warns that no job jar file is set and the map task then fails with a ClassNotFoundException. The console output and stack trace are below; can someone help me?

14/01/27 16:52:26 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
14/01/27 16:52:26 WARN mapred.JobClient: No job jar file set.  User classes may not be found. See JobConf(Class) or JobConf#setJar(String).
14/01/27 16:52:26 INFO input.FileInputFormat: Total input paths to process : 1
14/01/27 16:52:26 INFO util.NativeCodeLoader: Loaded the native-hadoop library
14/01/27 16:52:26 WARN snappy.LoadSnappy: Snappy native library not loaded
14/01/27 16:52:27 INFO mapred.JobClient: Running job: job_201401271610_0002
14/01/27 16:52:28 INFO mapred.JobClient:  map 0% reduce 0%
14/01/27 16:52:35 INFO mapred.JobClient: Task Id : attempt_201401271610_0002_m_000000_0, Status : FAILED
java.lang.RuntimeException: java.lang.ClassNotFoundException: org.gamma.WordCount$Map
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:849)
at org.apache.hadoop.mapreduce.JobContext.getMapperClass(JobContext.java:199)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:719)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)

I’m running this command

hadoop jar wordcount.jar org.gamma.WordCount /user/jeet/getty/gettysburg.txt /user/jeet/getty1/out

This is my word count class

import java.io.IOException;
import java.util.*;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.conf.*;
import org.apache.hadoop.io.*;
import org.apache.hadoop.mapreduce.*;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

public class WordCount {

    public static class Map extends Mapper<LongWritable, Text, Text, IntWritable> {
        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        public void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
            String line = value.toString();
            StringTokenizer tokenizer = new StringTokenizer(line);
            while (tokenizer.hasMoreTokens()) {
                word.set(tokenizer.nextToken());
                context.write(word, one);
            }
        }
    }

    public static class Reduce extends Reducer<Text, IntWritable, Text, IntWritable> {

        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            context.write(key, new IntWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();

        Job job = new Job(conf, "WordCount");

        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);

        job.setMapperClass(Map.class);
        job.setReducerClass(Reduce.class);

        job.setInputFormatClass(TextInputFormat.class);
        job.setOutputFormatClass(TextOutputFormat.class);

        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        job.waitForCompletion(true);
        job.setJarByClass(WordCount.class);
    }

}

Solution

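You are submitting the job before telling it which jar contains your classes. In your driver, setJarByClass is called after waitForCompletion, so by then the job has already been submitted without a jar and the task JVMs cannot load org.gamma.WordCount$Map. The warning in your output says exactly this: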

14/01/27 16:52:26 WARN mapred.JobClient: No job jar file set.  User classes may not be found. See JobConf(Class) or JobConf#setJar(String).

Swap the last two lines of the driver and it will work:

 job.setJarByClass(WordCount.class);
 job.waitForCompletion(true);
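
For context on why the order matters: as far as I understand it, setJarByClass takes the class you pass in, asks its class loader where the matching .class resource came from, and records the enclosing jar in the job configuration, which is what lets the task nodes load your Mapper and Reducer. The sketch below imitates that lookup using only JDK calls. JarLocator, resourceNameFor and findContainingJar are names made up for this illustration, not Hadoop's API, and the real implementation may differ in detail.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLDecoder;
import java.util.Enumeration;

// Hypothetical helper (not part of Hadoop): shows how a jar can be located from a class.
public class JarLocator {

    // Turn a class into the resource path a class loader uses,
    // e.g. org.gamma.WordCount$Map -> org/gamma/WordCount$Map.class
    public static String resourceNameFor(Class<?> clazz) {
        return clazz.getName().replace('.', '/') + ".class";
    }

    // Return the filesystem path of the jar that contains clazz,
    // or null if the class was not loaded from a jar (e.g. from a classes directory).
    public static String findContainingJar(Class<?> clazz) {
        ClassLoader loader = clazz.getClassLoader();
        if (loader == null) {
            // Classes from the bootstrap loader report a null loader.
            loader = ClassLoader.getSystemClassLoader();
        }
        try {
            Enumeration<URL> urls = loader.getResources(resourceNameFor(clazz));
            while (urls.hasMoreElements()) {
                URL url = urls.nextElement();
                if ("jar".equals(url.getProtocol())) {
                    // URL looks like jar:file:/path/to/wordcount.jar!/org/gamma/WordCount.class
                    String path = url.getPath();
                    if (path.startsWith("file:")) {
                        path = path.substring("file:".length());
                    }
                    // URLDecoder treats '+' as a space, so protect literal plus signs first.
                    path = URLDecoder.decode(path.replace("+", "%2B"), "UTF-8");
                    int bang = path.indexOf('!');
                    return bang >= 0 ? path.substring(0, bang) : path;
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return null;
    }

    public static void main(String[] args) {
        String jar = findContainingJar(JarLocator.class);
        System.out.println(jar == null ? "not loaded from a jar" : jar);
    }
}
```

If this class is packaged into a jar and run from there, main should print that jar's path. Run from a plain directory of .class files, it reports that no jar was found, and in that situation Hadoop has nothing to ship to the task nodes.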
