NullPointerException when attempting to serialize an Avro GenericRecord containing an array
I’m trying to publish Avro (to Kafka) and get a NullPointerException when trying to write an Avro object using BinaryEncoder.
Here is a simplified stack trace:
java.lang.NullPointerException: null of array of com.mycode.DeeplyNestedObject of array of com.mycode.NestedObject of union of com.mycode.ParentObject
at org.apache.avro.generic.GenericDatumWriter.npe(GenericDatumWriter.java:132) ~[avro-1.8.1.jar:1.8.1]
at org.apache.avro.generic.GenericDatumWriter.writeWithoutConversion(GenericDatumWriter.java:126) ~[avro-1.8.1.jar:1.8.1]
at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:73) ~[avro-1.8.1.jar:1.8.1]
at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:60) ~[avro-1.8.1.jar:1.8.1]
at com.mycode.KafkaAvroPublisher.send(KafkaAvroPublisher.java:61) ~[classes/:na]
....
at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:73) ~[avro-1.8.1.jar:1.8.1]
at org.apache.avro.generic.GenericDatumWriter.writeWithoutConversion(GenericDatumWriter.java:112) ~[avro-1.8.1.jar:1.8.1]
at org.apache.avro.specific.SpecificDatumWriter.writeField(SpecificDatumWriter.java:87) ~[avro-1.8.1.jar:1.8.1]
at org.apache.avro.generic.GenericDatumWriter.writeRecord(GenericDatumWriter.java:143) ~[avro-1.8.1.jar:1.8.1]
at org.apache.avro.generic.GenericDatumWriter.writeWithoutConversion(GenericDatumWriter.java:105) ~[avro-1.8.1.jar:1.8.1]
... 55 common frames omitted
This is the send method where the exception occurred in my code:
private static final EncoderFactory ENCODER_FACTORY = EncoderFactory.get();
private static final SpecificDatumWriter<ParentObject> PARENT_OBJECT_WRITER = new SpecificDatumWriter<>(ParentObject.SCHEMA$);

public void send(ParentObject parentObject) {
    try {
        ByteArrayOutputStream stream = new ByteArrayOutputStream();
        // binaryEncoder is a field declared elsewhere; passing it back in lets the factory reuse it
        binaryEncoder = ENCODER_FACTORY.binaryEncoder(stream, binaryEncoder);
        PARENT_OBJECT_WRITER.write(parentObject, binaryEncoder); // Exception HERE
        binaryEncoder.flush();
        producer.send(new ProducerRecord<>(topic, stream.toByteArray()));
    } catch (IOException ioe) {
        logger.debug("Problem publishing message to Kafka.", ioe);
    }
}
In the schema, NestedObject contains an array of DeeplyNestedObjects. I’ve done enough debugging to see that NestedObject actually contains an array of DeeplyNestedObjects, or an empty array when there are none. This is the relevant part of the schema:
[ { "namespace": "com.mycode.avro"
, "type": "record"
, "name": "NestedObject"
, "fields":
[ { "name": "timestamp", "type": "long", "doc": "Instant in time (milliseconds since epoch)." }
, { "name": "objs", "type": { "type": "array", "items": "DeeplyNestedObject" }, "doc": "Elided." }
]
}
]
Solution
Avro’s stack trace is misleading. The problem may be one level deeper than the class named in the exception message.
When it says “null of array of com.mycode.DeeplyNestedObject of array of com.mycode.NestedObject of union of com.mycode.ParentObject”, it means that one of the fields inside DeeplyNestedObject should be an array but was found to be null. (It is entirely understandable to misread it as DeeplyNestedObject being null inside NestedObject.)
You need to check the fields of DeeplyNestedObject and find out which array is not being serialized correctly. The problem is most likely where the DeeplyNestedObject is created: it will have a field of type array that is not populated in every case before the send method is called.
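One way to track down the offending field is to walk the object graph before calling send and list every field that is still null. The following is only a rough sketch using plain JDK reflection, not anything provided by Avro. The three classes are hypothetical stand-ins for the generated ones, and the field names (tags, nested, id) are invented for illustration, since the real fields of DeeplyNestedObject are not shown in the question.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

// Hypothetical stand-ins for the generated Avro classes; field names are invented.
class DeeplyNestedObject {
    String id = "d1";
    List<String> tags; // left null on purpose: an array field that was never populated
}

class NestedObject {
    long timestamp = 0L;
    List<DeeplyNestedObject> objs = new ArrayList<>();
}

class ParentObject {
    List<NestedObject> nested = new ArrayList<>();
}

class NullFieldFinder {

    /** Returns a path such as "ParentObject.nested[0].objs[0].tags" for every null field found. */
    static List<String> findNullFields(Object root) {
        List<String> found = new ArrayList<>();
        walk(root, root.getClass().getSimpleName(), found);
        return found;
    }

    private static void walk(Object node, String path, List<String> found) {
        if (node instanceof Collection) {
            // Descend into each element, recording null elements as well.
            int i = 0;
            for (Object element : (Collection<?>) node) {
                String elementPath = path + "[" + i++ + "]";
                if (element == null) {
                    found.add(elementPath);
                } else {
                    walk(element, elementPath, found);
                }
            }
            return;
        }
        if (node.getClass().getName().startsWith("java.")) {
            return; // do not reflect into JDK types such as String
        }
        // Only fields declared on the class itself are inspected (no superclass fields).
        for (Field field : node.getClass().getDeclaredFields()) {
            if (Modifier.isStatic(field.getModifiers())) {
                continue; // skips constants such as a generated SCHEMA$ field
            }
            field.setAccessible(true);
            Object value;
            try {
                value = field.get(node);
            } catch (IllegalAccessException e) {
                throw new IllegalStateException(e);
            }
            String fieldPath = path + "." + field.getName();
            if (value == null) {
                found.add(fieldPath);
            } else if (!field.getType().isPrimitive()) {
                walk(value, fieldPath, found);
            }
        }
    }

    public static void main(String[] args) {
        ParentObject parent = new ParentObject();
        NestedObject nested = new NestedObject();
        nested.objs.add(new DeeplyNestedObject());
        parent.nested.add(nested);
        System.out.println(findNullFields(parent));
        // prints [ParentObject.nested[0].objs[0].tags]
    }
}
```

Note that the sketch has no cycle detection and reports every null field, including ones whose schema is a union that allows null, so the output still has to be compared against the schema by hand. Once the field is identified, the fix belongs where the DeeplyNestedObject is built, for example by defaulting the array to an empty list.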