Java – Hierarchical MapReduce

Hierarchical MapReduce

Is it possible to define a hierarchical MapReduce job?
In other words, I want a MapReduce job that launches a different MapReduce job during its mapper phase. Is this possible? Do you have any suggestions for how to do it?

I want to do this in order to have a higher level of parallelism/distribution in my program.
Thank you
Arik.

Solution

The book Hadoop: The Definitive Guide contains many recipes for chaining MapReduce jobs, with sample code and detailed instructions. Look in particular for a section called “Advanced API Usage” or something similar.
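
To make the idea of a job chain concrete, here is a minimal in-memory sketch in plain Java. It is not Hadoop code and it is not taken from the book: the class ChainedJobsSketch and all of its methods are hypothetical names made up for illustration. It only shows the shape of a chain, where the second map-reduce pass starts after the first has finished and reads the first pass's output as its input.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.BiFunction;
import java.util.function.Function;

// In-memory illustration of chaining two map-reduce passes.
// Nothing here is Hadoop API; all names are hypothetical.
class ChainedJobsSketch {

    // One pass: map every record, group the emitted pairs by key, reduce each group.
    static <I, K extends Comparable<K>, V, O> List<O> runJob(
            List<I> input,
            Function<I, List<Map.Entry<K, V>>> mapper,
            BiFunction<K, List<V>, O> reducer) {
        Map<K, List<V>> groups = new TreeMap<>(); // sorted by key, like a shuffle
        for (I record : input) {
            for (Map.Entry<K, V> pair : mapper.apply(record)) {
                groups.computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                      .add(pair.getValue());
            }
        }
        List<O> output = new ArrayList<>();
        groups.forEach((key, values) -> output.add(reducer.apply(key, values)));
        return output;
    }

    // Job 1 mapper: emit (word, 1) for every word in a line.
    static List<Map.Entry<String, Integer>> emitWords(String line) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        for (String word : line.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                out.add(Map.entry(word, 1));
            }
        }
        return out;
    }

    // Job 1 reducer: sum the ones to get (word, count).
    static Map.Entry<String, Integer> sumCounts(String word, List<Integer> ones) {
        int total = 0;
        for (int one : ones) {
            total += one;
        }
        return Map.entry(word, total);
    }

    // Job 2 mapper: re-key each (word, count) record by its count.
    static List<Map.Entry<Integer, String>> keyByCount(Map.Entry<String, Integer> wordCount) {
        return List.of(Map.entry(wordCount.getValue(), wordCount.getKey()));
    }

    // Job 2 reducer: collect all words that share a count.
    static Map.Entry<Integer, List<String>> collectWords(Integer count, List<String> words) {
        return Map.entry(count, words);
    }

    // The "driver": job 2 runs only after job 1 is done, and consumes its output.
    static List<Map.Entry<Integer, List<String>>> runChain(List<String> lines) {
        List<Map.Entry<String, Integer>> counts =
                runJob(lines, ChainedJobsSketch::emitWords, ChainedJobsSketch::sumCounts);
        return runJob(counts, ChainedJobsSketch::keyByCount, ChainedJobsSketch::collectWords);
    }

    public static void main(String[] args) {
        System.out.println(runChain(List.of("a b a", "b c")));
    }
}
```

In a real Hadoop driver the same pattern would typically be two jobs submitted one after the other, with the second job's input path pointing at the first job's output directory. Note that this sequences whole jobs rather than launching a job from inside a mapper.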

I personally managed to replace a complex MapReduce job, which used several HBase tables as its sources, with a handcrafted TableInputFormat extension. The resulting input format combines the source data itself, doing a minimal amount of reduction as it reads, so the job became a single mapper step. I suggest you look in this direction as well.
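
The custom input format itself is not shown here, so the following is only a rough in-memory sketch of the general idea, under assumptions: each source is modelled as a TreeMap sorted by row key (standing in for a table scanned in key order), and the class CombinedInputSketch and its methods are hypothetical names, not HBase or Hadoop API. The reader merges rows with the same key from all sources into one record, so the processing step that follows needs no reducer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

// In-memory stand-in for an input format that merges several sorted sources.
// Not HBase or Hadoop API; every name here is hypothetical.
class CombinedInputSketch {

    // Reader side: walk the row keys of all sources in sorted order and build
    // one combined record per key, so no shuffle is needed to bring matching
    // rows together.
    static List<Map.Entry<String, List<String>>> readCombined(
            List<TreeMap<String, String>> sources) {
        TreeSet<String> rowKeys = new TreeSet<>();
        for (TreeMap<String, String> source : sources) {
            rowKeys.addAll(source.keySet());
        }
        List<Map.Entry<String, List<String>>> records = new ArrayList<>();
        for (String rowKey : rowKeys) {
            List<String> cells = new ArrayList<>();
            for (TreeMap<String, String> source : sources) {
                String cell = source.get(rowKey);
                if (cell != null) {
                    cells.add(cell);
                }
            }
            records.add(Map.entry(rowKey, cells));
        }
        return records;
    }

    // Map-only step: each combined record is processed independently.
    static String map(Map.Entry<String, List<String>> record) {
        return record.getKey() + ":" + String.join("+", record.getValue());
    }

    // The whole "job": read combined records, apply the mapper, no reduce phase.
    static List<String> runMapOnly(List<TreeMap<String, String>> sources) {
        List<String> output = new ArrayList<>();
        for (Map.Entry<String, List<String>> record : readCombined(sources)) {
            output.add(map(record));
        }
        return output;
    }

    public static void main(String[] args) {
        TreeMap<String, String> users = new TreeMap<>(Map.of("u1", "alice", "u2", "bob"));
        TreeMap<String, String> orders = new TreeMap<>(Map.of("u1", "book", "u3", "pen"));
        System.out.println(runMapOnly(List.of(users, orders)));
    }
}
```

The design point is that the join work moves into the reading side, which is only practical when the sources are already sorted or indexed by the same key.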
