Java – RawComparator and WritableComparable

compare() and compareTo() do the same job when it comes to sorting keys, but I wonder whether, in an age of highly configurable machines, it still matters when to use compare() and when to use compareTo().

If there is a reason to prefer compare(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) over compareTo(Object key1, Object key2), please suggest which field, use case, or problem type really requires us to decide between them.

Thank you!!

Solution

Use of RawComparator:

If you want to reduce the time spent in MapReduce jobs, use a RawComparator.

Intermediate key-value pairs are passed from the Mapper to the Reducer. Before they reach the Reducer, the shuffle and sort steps are performed.

Sorting is faster because a RawComparator compares the keys directly on their serialized bytes. Without a RawComparator, the intermediate keys have to be fully deserialized before they can be compared.
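To make the difference concrete, here is a minimal plain-Java sketch (no Hadoop dependency) that compares two serialized int keys both ways. The class RawVsDeserialized and its methods are made up for illustration; readInt only mimics decoding a big-endian int as written by DataOutput.writeInt, and the deserializing path is a simplified stand-in for rebuilding a key object.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class RawVsDeserialized {

    // Serialize an int the way DataOutput.writeInt does: 4 bytes, big-endian.
    static byte[] serialize(int value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Decode a big-endian int straight from the buffer, without streams or key objects.
    static int readInt(byte[] b, int start) {
        return ((b[start] & 0xff) << 24)
             | ((b[start + 1] & 0xff) << 16)
             | ((b[start + 2] & 0xff) << 8)
             |  (b[start + 3] & 0xff);
    }

    // Raw path: work on the serialized bytes in place.
    static int compareRaw(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
        return Integer.compare(readInt(b1, s1), readInt(b2, s2));
    }

    // Deserializing path: rebuild each key from a stream first, then compare.
    static int compareDeserialized(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
        try {
            int k1 = new DataInputStream(new ByteArrayInputStream(b1, s1, l1)).readInt();
            int k2 = new DataInputStream(new ByteArrayInputStream(b2, s2, l2)).readInt();
            return Integer.compare(k1, k2);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] a = serialize(-5);
        byte[] b = serialize(42);
        System.out.println(compareRaw(a, 0, a.length, b, 0, b.length));          // prints -1
        System.out.println(compareDeserialized(a, 0, a.length, b, 0, b.length)); // prints -1
    }
}
```

Both paths give the same ordering; the raw one just skips the stream and object allocations, which is where the savings come from when a sort performs many comparisons.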

Example:

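import org.apache.hadoop.io.WritableComparator;

// IndexPair is the job's key type (not shown); it is assumed to serialize two ints back to back.
public class IndexPairComparator extends WritableComparator {
    protected IndexPairComparator() {
        super(IndexPair.class);
    }

    @Override
    public int compare(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
        // Compare the first int of each key straight from the serialized bytes.
        int i1 = readInt(b1, s1);
        int i2 = readInt(b2, s2);
        int comp = (i1 < i2) ? -1 : (i1 == i2) ? 0 : 1;
        if (0 != comp)
            return comp;

        // First ints are equal: fall back to the second int, 4 bytes further in.
        int j1 = readInt(b1, s1 + 4);
        int j2 = readInt(b2, s2 + 4);
        comp = (j1 < j2) ? -1 : (j1 == j2) ? 0 : 1;

        return comp;
    }
}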

In the above example, we did not directly implement RawComparator. Instead, we extended WritableComparator, which implements RawComparator internally.
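The offsets s1 and s1 + 4 only make sense if IndexPair serializes two 4-byte ints back to back. The class is not shown here, so the following plain-Java sketch assumes that layout. IndexPairSketch, its fields i and j, and toBytes are hypothetical names, and it uses java.io.DataOutput directly rather than Hadoop's Writable interface.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;

class IndexPairSketch {
    final int i;
    final int j;

    IndexPairSketch(int i, int j) {
        this.i = i;
        this.j = j;
    }

    // Assumed layout: bytes 0-3 hold i, bytes 4-7 hold j, both big-endian.
    void write(DataOutput out) throws IOException {
        out.writeInt(i);
        out.writeInt(j);
    }

    // Helper for the demo: serialize this key into a fresh byte array.
    byte[] toBytes() {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try {
            write(new DataOutputStream(bytes));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Orders by i, then by j, reading both fields from the serialized form.
    static int compareRaw(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
        ByteBuffer k1 = ByteBuffer.wrap(b1, s1, l1); // position starts at s1
        ByteBuffer k2 = ByteBuffer.wrap(b2, s2, l2); // position starts at s2
        int comp = Integer.compare(k1.getInt(), k2.getInt()); // field i at offset s
        if (comp != 0) {
            return comp;
        }
        return Integer.compare(k1.getInt(), k2.getInt());     // field j at offset s + 4
    }

    public static void main(String[] args) {
        byte[] a = new IndexPairSketch(1, 7).toBytes();
        byte[] b = new IndexPairSketch(1, 9).toBytes();
        byte[] c = new IndexPairSketch(2, 0).toBytes();
        System.out.println(compareRaw(a, 0, a.length, b, 0, b.length)); // prints -1
        System.out.println(compareRaw(c, 0, c.length, b, 0, b.length)); // prints 1
    }
}
```

If the real key class writes its fields in a different order or with a variable-length encoding, the fixed offsets in the raw comparator would have to change accordingly.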

Take a look at this article written by Jee Vang.

The default implementation of RawComparator's compare() in WritableComparator deserializes both keys and then compares the resulting objects:

public int compare(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
    try {
        buffer.reset(b1, s1, l1);                    // parse key1
        key1.readFields(buffer);

        buffer.reset(b2, s2, l2);                    // parse key2
        key2.readFields(buffer);

    } catch (IOException e) {
        throw new RuntimeException(e);
    }

    return compare(key1, key2);                      // compare them
}

See the WritableComparator source for details.
