Java – Find # of friends for all users: How to implement with Hadoop MapReduce?

Let’s say I have the following input:

(1,2)(2,1)(1,3)(3,2)(2,4)(4,1) 

The expected output is as follows:

(1,(2,3,4)) -> (1,3) // second element is the total number of friends
(2,(1,3,4)) -> (2,3)
(3,(1,2))   -> (3,2)
(4,(1,2))   -> (4,2)

I know how to do this using hash sets in Java, but I am not sure how it works with the MapReduce model. Can anyone suggest ideas or sample code for this problem? I would appreciate it.

----------------------------------------

This is my naïve solution: one mapper, two reducers.

The mapper takes the input pairs, for example (1,2), (2,1), (1,3), and emits each pair in both directions:

*(1,hashset<2>), (2,hashset<1>), (1,hashset<2>), (2,hashset<1>), (1,hashset<3>), (3,hashset<1>)*

Reducer1:

Takes the mapper output as input and merges the sets for each key:

*(1,hashset<2,3>), (3,hashset<1>), and (2,hashset<1>)*

Reducer2:

Takes the output of Reducer1 as input and outputs the size of each set:

*(1,2), (3,1), and (2,1)*

It’s just my naïve solution. I’m not sure whether this can be done in Hadoop code.

Solution

I think there should be an easy way to solve this problem.

Mapper Input: (1,2)(2,1)(1,3)(3,2)(2,4)(4,1)

Emit two records for each pair, one in each direction:

Mapper Output / Reducer Input:

Key => Value
1 => 2
2 => 1
2 => 1
1 => 2
1 => 3
3 => 1
3 => 2
2 => 3
2 => 4
4 => 2
4 => 1
1 => 4
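The map step above can be sketched in plain Java, without any Hadoop dependency. This is only an illustration under assumptions: the class name `FriendMapSketch` and the `map` helper are hypothetical, and in a real job the same logic would sit inside the `map` method of a Hadoop `Mapper` subclass and write records through its `Context` rather than return a list.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java stand-in for the map phase (hypothetical names, no Hadoop classes).
class FriendMapSketch {

    // For one input pair (a,b), emit the two records a => b and b => a.
    static List<int[]> map(int a, int b) {
        List<int[]> out = new ArrayList<>();
        out.add(new int[] {a, b});
        out.add(new int[] {b, a});
        return out;
    }

    public static void main(String[] args) {
        int[][] input = {{1, 2}, {2, 1}, {1, 3}, {3, 2}, {2, 4}, {4, 1}};
        for (int[] pair : input) {
            for (int[] kv : map(pair[0], pair[1])) {
                System.out.println(kv[0] + " => " + kv[1]);
            }
        }
    }
}
```

Emitting both directions means every user appears as a key, even one who only ever shows up as the second element of a pair.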

On the reducer side, you get 4 different groups. After dropping duplicate values (for example, by collecting them into a set), they look as follows:

Reducer Output:

Key => Values
1 => [2,3,4]
2 => [1,3,4]
3 => [1,2]
4 => [1,2]
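To make the grouping and deduplication concrete, here is a minimal plain-Java simulation of the whole flow: map, shuffle (group by key), and reduce. It is a sketch under assumptions rather than actual Hadoop code: `FriendCountSketch`, `mapAndShuffle`, and `reduce` are hypothetical names, and the in-memory `TreeMap` stands in for the grouping that the framework performs between the map and reduce phases.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// In-memory simulation of the job (hypothetical names, no Hadoop classes).
class FriendCountSketch {

    // Map + shuffle: emit both directions of every pair and group values by key.
    static Map<Integer, List<Integer>> mapAndShuffle(int[][] pairs) {
        Map<Integer, List<Integer>> grouped = new TreeMap<>();
        for (int[] p : pairs) {
            grouped.computeIfAbsent(p[0], k -> new ArrayList<>()).add(p[1]);
            grouped.computeIfAbsent(p[1], k -> new ArrayList<>()).add(p[0]);
        }
        return grouped;
    }

    // Reduce: collapse duplicate values for one key into a sorted set.
    static Set<Integer> reduce(List<Integer> values) {
        return new TreeSet<>(values);
    }

    public static void main(String[] args) {
        int[][] input = {{1, 2}, {2, 1}, {1, 3}, {3, 2}, {2, 4}, {4, 1}};
        for (Map.Entry<Integer, List<Integer>> e : mapAndShuffle(input).entrySet()) {
            Set<Integer> friends = reduce(e.getValue());
            // Print (user, friend count) followed by the deduplicated friend set.
            System.out.println("(" + e.getKey() + "," + friends.size() + ") " + friends);
        }
    }
}
```

For the sample input, user 1 receives the raw values 2, 2, 3, 4 because both (1,2) and (2,1) are present, which is why the reduce step needs a set rather than a plain count of values.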

Now you can format the result as you like, for example by emitting the size of each list as the friend count. 🙂
Let me know if anyone sees any issues with this approach.
