Java – NullPointerException when attempting to serialize an Avro GenericRecord containing an array

NullPointerException when attempting to serialize an Avro GenericRecord containing an array

I’m trying to publish Avro records to Kafka and am getting a NullPointerException when writing an Avro object using a BinaryEncoder.

Here is a simplified stack trace:

java.lang.NullPointerException: null of array of com.mycode.DeeplyNestedObject of array of com.mycode.NestedObject of union of com.mycode.ParentObject
    at org.apache.avro.generic.GenericDatumWriter.npe(GenericDatumWriter.java:132) ~[avro-1.8.1.jar:1.8.1]
    at org.apache.avro.generic.GenericDatumWriter.writeWithoutConversion(GenericDatumWriter.java:126) ~[avro-1.8.1.jar:1.8.1]
    at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:73) ~[avro-1.8.1.jar:1.8.1]
    at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:60) ~[avro-1.8.1.jar:1.8.1]
    at com.mycode.KafkaAvroPublisher.send(KafkaAvroPublisher.java:61) ~[classes/:na]
    ....
    at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:73) ~[avro-1.8.1.jar:1.8.1]
    at org.apache.avro.generic.GenericDatumWriter.writeWithoutConversion(GenericDatumWriter.java:112) ~[avro-1.8.1.jar:1.8.1]
    at org.apache.avro.specific.SpecificDatumWriter.writeField(SpecificDatumWriter.java:87) ~[avro-1.8.1.jar:1.8.1]
    at org.apache.avro.generic.GenericDatumWriter.writeRecord(GenericDatumWriter.java:143) ~[avro-1.8.1.jar:1.8.1]
    at org.apache.avro.generic.GenericDatumWriter.writeWithoutConversion(GenericDatumWriter.java:105) ~[avro-1.8.1.jar:1.8.1]
    ... 55 common frames omitted

This is the send method where the exception occurred in my code:

private static final EncoderFactory ENCODER_FACTORY = EncoderFactory.get();
private static final SpecificDatumWriter<ParentObject> PARENT_OBJECT_WRITER = new SpecificDatumWriter<>(ParentObject.SCHEMA$);
private BinaryEncoder binaryEncoder;  // reused between calls via the factory

public void send(ParentObject parentObject) {
    try {
        ByteArrayOutputStream stream = new ByteArrayOutputStream();
        binaryEncoder = ENCODER_FACTORY.binaryEncoder(stream, binaryEncoder);
        PARENT_OBJECT_WRITER.write(parentObject, binaryEncoder);  // <-- exception thrown here
        binaryEncoder.flush();
        producer.send(new ProducerRecord<>(topic, stream.toByteArray()));
    } catch (IOException ioe) {
        logger.debug("Problem publishing message to Kafka.", ioe);
    }
}

In the schema, NestedObject contains an array of DeeplyNestedObjects. I’ve done enough debugging to confirm that NestedObject really does contain that array: it holds the DeeplyNestedObjects when they exist, and is an empty array when they don’t. This is the relevant part of the schema:

[ { "namespace": "com.mycode.avro"
  , "type": "record"
  , "name": "NestedObject"
  , "fields":
    [ { "name": "timestamp", "type": "long", "doc": "Instant in time (milliseconds since epoch)." }
    , { "name": "objs", "type": { "type": "array", "items": "DeeplyNestedObject" }, "doc": "Elided." }
    ]
  }
]

Solution

Avro’s stack trace is misleading here: the problem sits one level deeper than the class named in the exception message.

When it says "null of array of com.mycode.DeeplyNestedObject of array of com.mycode.NestedObject of union of com.mycode.ParentObject", it means that a field inside DeeplyNestedObject that should be an array was found to be null. (It is easy to misread this as saying that a DeeplyNestedObject inside NestedObject is null.) The message is built from the inside out: the null array lives in DeeplyNestedObject, each DeeplyNestedObject lives in an array inside NestedObject, and NestedObject lives in a union inside ParentObject.
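A minimal, self-contained sketch that reproduces this class of failure with a GenericRecord (the schema, class name, and field name here are invented for illustration, not taken from the question):

import java.io.ByteArrayOutputStream;

import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.io.BinaryEncoder;
import org.apache.avro.io.EncoderFactory;

public class NullArrayRepro {
    private static final Schema SCHEMA = new Schema.Parser().parse(
        "{\"type\": \"record\", \"name\": \"Demo\", \"fields\": ["
      + "{\"name\": \"objs\", \"type\": {\"type\": \"array\", \"items\": \"long\"}}]}");

    public static void main(String[] args) throws Exception {
        // The "objs" field is never set, so it is null instead of an empty list.
        GenericRecord record = new GenericData.Record(SCHEMA);

        GenericDatumWriter<GenericRecord> writer = new GenericDatumWriter<>(SCHEMA);
        BinaryEncoder encoder = EncoderFactory.get().binaryEncoder(new ByteArrayOutputStream(), null);
        writer.write(record, encoder);  // NullPointerException with a "null of array ..." message
    }
}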

You need to check the fields of DeeplyNestedObject and work out which array is not being serialized correctly. The problem is most likely at the point where the DeeplyNestedObject is created: it has a field of type array that the code does not populate in every case before the send method is called.
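In the sketch above, the fix is to set the array field to an empty list instead of leaving it null before calling write:

import java.util.Collections;

// Fix for the sketch above: populate the array field even when there is nothing in it.
record.put("objs", Collections.emptyList());

The same idea applies to your generated SpecificRecord classes: make sure the corresponding setter is called with an empty list on every code path. If you construct objects through the generated newBuilder() methods, giving the array field a "default": [] in the schema should also let the builder fill it in for you.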
