Hadoop: An array of primitives as values in key-value pairs
I asked a very similar question in a previous post, "Hadoop: How can i have an array of doubles as a value in a key-value pair?".
My problem is that I want to pass an array of doubles as a value from the map stage to the reduce stage. The answer I got was to serialize it, convert it to text, pass it to the reducer, and deserialize it there. That works, but it amounts to serializing and deserializing twice.
ArrayWritable only accepts types that implement Writable, such as FloatWritable. So another solution is to convert my array of doubles into an array of DoubleWritables. But that conversion also takes time, and creating Writable objects is expensive. Isn't there a simple solution like ArrayWritable array = new ArrayWritable(Double.class)?
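For reference, the ArrayWritable route described above is usually written as a subclass that pins the element type to DoubleWritable via the superclass constructor (a common Hadoop idiom; the class name here is my own). Each primitive double still has to be boxed, which is exactly the overhead being complained about:

```java
import org.apache.hadoop.io.ArrayWritable;
import org.apache.hadoop.io.DoubleWritable;
import org.apache.hadoop.io.Writable;

// Pin ArrayWritable's element type so Hadoop can instantiate
// the elements during deserialization.
public class DoubleWritableArray extends ArrayWritable {

    public DoubleWritableArray() {
        super(DoubleWritable.class);
    }

    public DoubleWritableArray(double[] values) {
        super(DoubleWritable.class);
        // Each primitive double must be boxed in a DoubleWritable.
        Writable[] boxed = new Writable[values.length];
        for (int i = 0; i < values.length; i++) {
            boxed[i] = new DoubleWritable(values[i]);
        }
        set(boxed);
    }
}
```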
Solution
Just implement the Writable interface yourself. For example:
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;

import org.apache.hadoop.io.Writable;

public class DoubleArrayWritable implements Writable {

    private double[] data;

    public DoubleArrayWritable() {
        // Hadoop needs a no-arg constructor to create
        // an instance before calling readFields().
    }

    public DoubleArrayWritable(double[] data) {
        this.data = data;
    }

    public double[] getData() {
        return data;
    }

    public void setData(double[] data) {
        this.data = data;
    }

    @Override
    public void write(DataOutput out) throws IOException {
        // Write a length prefix, then the raw doubles.
        int length = (data != null) ? data.length : 0;
        out.writeInt(length);
        for (int i = 0; i < length; i++) {
            out.writeDouble(data[i]);
        }
    }

    @Override
    public void readFields(DataInput in) throws IOException {
        int length = in.readInt();
        data = new double[length];
        for (int i = 0; i < length; i++) {
            data[i] = in.readDouble();
        }
    }
}
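A quick way to sanity-check the write()/readFields() logic is to round-trip an array through plain DataOutputStream/DataInputStream, which implement the same DataOutput/DataInput interfaces Hadoop hands you. The sketch below inlines the identical length-prefix logic as static helpers so it runs without Hadoop on the classpath:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.Arrays;

public class RoundTripDemo {

    // Same logic as DoubleArrayWritable.write(): length prefix, then values.
    static void write(DataOutput out, double[] data) throws IOException {
        out.writeInt(data.length);
        for (double d : data) {
            out.writeDouble(d);
        }
    }

    // Same logic as DoubleArrayWritable.readFields().
    static double[] read(DataInput in) throws IOException {
        double[] data = new double[in.readInt()];
        for (int i = 0; i < data.length; i++) {
            data[i] = in.readDouble();
        }
        return data;
    }

    public static void main(String[] args) throws IOException {
        double[] original = {1.5, -2.25, 3.75};

        // Serialize to an in-memory byte buffer.
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        write(new DataOutputStream(bytes), original);

        // Deserialize from the same bytes and compare.
        double[] restored = read(new DataInputStream(
                new ByteArrayInputStream(bytes.toByteArray())));
        System.out.println(Arrays.equals(original, restored)); // prints true
    }
}
```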