Java – Cannot use CompositeInputFormat in a Map-Side Join

Cannot use CompositeInputFormat in a map-side join? Here is a solution to the problem.


I’m trying to implement a map-side join using CompositeInputFormat, but I’m hitting an error in the MapReduce job that I can’t resolve. In the code below, the error occurs when I call the compose method and also when I set the input format class. The error is as follows:

The method compose(String, Class, Path…) in
the type CompositeInputFormat is not applicable for the arguments
(String, Class, Path[])

Can anyone help?

package Hadoop.MR.Practice;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.TextInputFormat;
import org.apache.hadoop.mapred.join.CompositeInputFormat;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.KeyValueTextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class MapJoinJob implements Tool{

private Configuration conf;     

public Configuration getConf() {
    return conf;
}
public void setConf(Configuration conf) {
    this.conf = conf;
}
@Override
public int run(String[] args) throws Exception {
    Job job = Job.getInstance(getConf(), "MapSideJoinJob");
    job.setJarByClass(this.getClass());

    Path[] inputs = new Path[] { new Path(args[0]), new Path(args[1]) };
    String join = CompositeInputFormat.compose("inner", KeyValueTextInputFormat.class, inputs);
    job.getConfiguration().set("mapreduce.join.expr", join);

    job.setInputFormatClass(CompositeInputFormat.class);

    job.setMapperClass(MapJoinMapper.class);
    job.setMapOutputKeyClass(Text.class);
    job.setMapOutputValueClass(LongWritable.class);

    // Configuring reducer
    job.setReducerClass(WCReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(LongWritable.class);
    job.setNumReduceTasks(0);

    FileOutputFormat.setOutputPath(job, new Path(args[2]));

    job.waitForCompletion(true);
    return 0;
}

public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    MapJoinJob mjJob = new MapJoinJob();
    ToolRunner.run(conf, mjJob, args);
}
}

Solution

I would say your issue is caused by mixing the two Hadoop APIs: your imports combine classes from the old mapred packages with classes from the new mapreduce packages.

For example, you try to use org.apache.hadoop.mapreduce.lib.input.KeyValueTextInputFormat with org.apache.hadoop.mapred.join.CompositeInputFormat. This is unlikely to work: the mapred compose method’s Class parameter is bounded to the mapred InputFormat, so the mapreduce KeyValueTextInputFormat.class does not satisfy the bound, and the compiler rejects the whole call with the “not applicable for the arguments” message you quoted.
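To see why the compiler produces that exact style of message, here is a small plain-Java model (no Hadoop involved; every class name below is made up for illustration). A parameter bounded as Class&lt;? extends OldFormat&gt; rejects a Class from an unrelated hierarchy, and javac reports the entire argument list as “not applicable”, much like the error in the question:

```java
// Toy model of the API mismatch. OldFormat stands in for the old
// org.apache.hadoop.mapred.InputFormat; NewFormat for the new
// org.apache.hadoop.mapreduce.InputFormat. The two are unrelated types,
// just like the real classes.
class OldFormat {}
class NewFormat {}
class KeyValueNewFormat extends NewFormat {} // stands in for the mapreduce KeyValueTextInputFormat

public class ComposeDemo {
    // Mirrors the shape of the mapred compose(String, Class<? extends InputFormat>, Path...)
    static String compose(String op, Class<? extends OldFormat> inf, String... paths) {
        return op + ":" + inf.getSimpleName() + ":" + paths.length;
    }

    public static void main(String[] args) {
        // The next line would NOT compile, with an error like the one quoted above,
        // because Class<KeyValueNewFormat> does not satisfy Class<? extends OldFormat>:
        // compose("inner", KeyValueNewFormat.class, "a", "b");

        // Passing a class from the matching hierarchy compiles fine:
        System.out.println(compose("inner", OldFormat.class, "a", "b")); // prints "inner:OldFormat:2"
    }
}
```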

You should choose one API (I would say probably mapreduce, the newer one) and make sure everything in the job uses it consistently.
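For reference, here is a minimal sketch of what the join setup could look like when the driver stays entirely on the mapreduce API (assuming Hadoop 2.x; MapJoinMapper and WCReducer are the asker’s own classes, and this is an untested outline, not a verified job):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.KeyValueTextInputFormat;
// Note: lib.join, not mapred.join — this is the new-API CompositeInputFormat
import org.apache.hadoop.mapreduce.lib.join.CompositeInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class MapSideJoinDriver {
    public static Job configure(Configuration conf, String[] args) throws Exception {
        Job job = Job.getInstance(conf, "MapSideJoinJob");

        // compose() now accepts the mapreduce KeyValueTextInputFormat, because
        // its Class<? extends InputFormat> bound refers to the mapreduce InputFormat
        String joinExpr = CompositeInputFormat.compose("inner",
                KeyValueTextInputFormat.class,
                new Path(args[0]), new Path(args[1]));
        job.getConfiguration().set("mapreduce.join.expr", joinExpr);
        job.setInputFormatClass(CompositeInputFormat.class);

        FileOutputFormat.setOutputPath(job, new Path(args[2]));
        return job;
    }
}
```

Note that, as with the old API, CompositeInputFormat expects the joined inputs to be sorted by key and partitioned identically, so the join expression alone is not enough if the input files don’t meet that condition.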
