Java – Do Mappers and Reducers in Hadoop have to be static classes?


I was trying to do something simple in Hadoop and noticed that wherever mappers and reducers are written, they are defined as static. My task will be broken down into several map stages and one final reduce. If I define a mapper as a static inner class, can I reuse it in another job? Also, non-trivial problems may need more, and more complex, mappers, so putting them all in one giant file would be hard to maintain.

Is there any way to make the mappers and reducers regular classes (maybe even in a separate JAR) rather than part of the job class itself?

Solution

Are you asking whether the class has to be static, whether it can be static, whether it can be an inner class, or whether it should be an inner class?

Hadoop itself needs to be able to instantiate your Mapper or Reducer by reflection, given the class reference or name configured in your job. If it is a non-static inner class, this will fail, because an instance can only be created in the context of an instance of its enclosing class, which Hadoop presumably knows nothing about. (Unless, I suppose, the inner class extends its enclosing class.)
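The reflection problem can be reproduced in plain Java, with no Hadoop dependency. The sketch below is only an illustration: `ReflectionDemo`, `StaticMapper`, `InnerMapper` and `canInstantiate` are hypothetical names, and it assumes the framework looks up a no-argument constructor on the configured class, which is the usual way to instantiate a class by reflection.

```java
import java.lang.reflect.Constructor;

// Hypothetical demo class; none of these names come from Hadoop.
class ReflectionDemo {

    // Static nested class: it has a genuine no-argument constructor.
    static class StaticMapper {
    }

    // Non-static inner class: its constructor implicitly takes a
    // ReflectionDemo instance as a hidden first parameter.
    class InnerMapper {
    }

    // Stands in for a framework that only has the Class object
    // and wants to create a new instance from it.
    static boolean canInstantiate(Class<?> cls) {
        try {
            Constructor<?> ctor = cls.getDeclaredConstructor();
            ctor.setAccessible(true);
            ctor.newInstance();
            return true;
        } catch (ReflectiveOperationException e) {
            // For InnerMapper this is a NoSuchMethodException:
            // there is no constructor that takes zero arguments.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("static nested: " + canInstantiate(StaticMapper.class)); // true
        System.out.println("inner: " + canInstantiate(InnerMapper.class));          // false
    }
}
```

The inner class fails because the only constructor it has expects an enclosing `ReflectionDemo` instance, which code that holds nothing but the class name cannot supply.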

So, to answer the first question: it should not be non-static, as that would almost certainly make it unusable. To answer the second and third: it can be a static nested class.

To me, a Mapper or Reducer is clearly a top-level concept that deserves a top-level class. Some people like to make them static nested classes to pair them with the “runner” class. I don’t like this, because that is really what subpackages are for. You have also pointed out another design reason to avoid it. So, on the fourth question: no, I don’t think inner classes are good practice.

On your final question: yes, the Mapper and Reducer classes can live in a separate JAR file. You tell Hadoop which JAR file(s) contain all this code, and that is what it ships to the workers. The workers don’t need your job class. They do, however, need everything the Mapper and Reducer depend on to be in that same JAR.
