Java – Scalding jobs on EMR version 4.2.0 failed due to VerifyError

We have a Scalding job that we want to run on AWS Elastic MapReduce using release label 4.2.0.

This job ran successfully on AMI 2.4.2. When we upgraded to AMI 3.7.0, we hit a java.lang.VerifyError caused by an incompatible jar: our project uses version 1.5 of the commons-codec library, but earlier, incompatible versions shipped with the AMI. Likewise, our project uses Scala 2.10, but the AMI comes with Scala 2.11. We solved this by removing every file on the cluster matching commons-codec-1.[234].jar or scala-library-2.11.*.jar, as sketched below.
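The purge amounted to a bootstrap action along these lines (a sketch reconstructed from the description above; the exact set of paths on the AMI may vary):

```
#!/bin/bash
# Remove the jar versions that conflict with the ones bundled in our
# assembly (we build against commons-codec 1.5 and Scala 2.10)
sudo find / -name 'commons-codec-1.[234].jar' -delete 2>/dev/null
sudo find / -name 'scala-library-2.11.*.jar' -delete 2>/dev/null
```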

Now we are upgrading again, this time to release 4.2.0, and getting a VerifyError:

```
Exception in thread "main" java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
    at com.twitter.scalding.Job$.apply(Job.scala:47)
    at com.twitter.scalding.Tool.getJob(Tool.scala:48)
    at com.twitter.scalding.Tool.run(Tool.scala:68)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at com.snowplowanalytics.snowplow.enrich.hadoop.JobRunner$.main(JobRunner.scala:33)
    at com.snowplowanalytics.snowplow.enrich.hadoop.JobRunner.main(JobRunner.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: java.lang.VerifyError: Bad type on operand stack
Exception Details:
  Location:
    com/snowplowanalytics/snowplow/enrich/common/utils/ConversionUtils$.decodeBase64Url(Ljava/lang/String;Ljava/lang/String;)Lscalaz/Validation; @5: invokevirtual
  Reason:
    Type 'org/apache/commons/codec/binary/Base64' (current frame, stack[0]) is not assignable to 'org/apache/commons/codec/binary/BaseNCodec'
  Current Frame:
    bci: @5
    flags: { }
    locals: { 'com/snowplowanalytics/snowplow/enrich/common/utils/ConversionUtils$', 'java/lang/String', 'java/lang/String' }
    stack: { 'org/apache/commons/codec/binary/Base64', 'java/lang/String' }
  Bytecode:
    0000000: 2ab7 008a 2cb6 0090 3a04 bb00 5459 1904
    0000010: b200 96b7 0099 3a05 b200 9e19 05b9 00a4
    0000020: 0200 b900 aa01 00a7 003e 4eb2 009e bb00
    0000030: ac59 b200 4112 aeb6 00b1 b700 b4b2 0041
    0000040: 06bd 0004 5903 2b53 5904 2c53 5905 2db6
    0000050: 00b9 53b6 00bf b900 c502 00b9 00a4 0200
    0000060: b900 c801 00b0                         
  Exception Handler Table:
    bci [0, 42] => handler: 42
  Stackmap Table:
    same_locals_1_stack_item_frame(@42,Object[#182])
    same_locals_1_stack_item_frame(@101,Object[#206])

    at com.snowplowanalytics.snowplow.enrich.hadoop.EtlJobConfig$.com$snowplowanalytics$snowplow$enrich$hadoop$EtlJobConfig$$base64ToJsonNode(EtlJobConfig.scala:224)
    at com.snowplowanalytics.snowplow.enrich.hadoop.EtlJobConfig$.loadConfigAndFilesToCache(EtlJobConfig.scala:126)
    at com.snowplowanalytics.snowplow.enrich.hadoop.EtlJob.<init>(EtlJob.scala:139)
    ... 16 more
```
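The "Reason" line is the telling detail: BaseNCodec was only introduced in commons-codec 1.5, so the Base64 class being loaded at run time must come from an older release, even though our assembly bundles 1.5. A first step is to find every jar on the box that ships this class (a rough sketch, run on the master node; it scans the whole filesystem, so it is slow):

```
# List every jar that contains the commons-codec Base64 class, to see
# which copies a job could be picking up
sudo find / -name '*.jar' 2>/dev/null | while read -r jar; do
  if sudo unzip -l "$jar" 2>/dev/null | grep -q 'org/apache/commons/codec/binary/Base64.class'; then
    echo "$jar"
  fi
done
```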

Exploring which jars remain on the cluster following the purge:

```
$ sudo find / -name "*scala-*"
/usr/share/aws/emr/emrfs/cli/lib/scala-library-2.10.5.jar
/usr/share/aws/emr/emrfs/cli/lib/scala-reflect-2.10.4.jar
/usr/share/aws/emr/emrfs/cli/lib/scala-logging-api_2.10-2.1.2.jar
/usr/share/aws/emr/emrfs/cli/lib/nscala-time_2.10-1.2.0.jar
/usr/share/aws/emr/emrfs/cli/lib/scala-logging-slf4j_2.10-2.1.2.jar
$ sudo find / -name "*commons-codec*"
/usr/share/aws/emr/node-provisioner/lib/commons-codec-1.9.jar
/usr/share/aws/emr/emr-metrics/lib/commons-codec-1.6.jar
/usr/share/aws/emr/emr-metrics-client/lib/commons-codec-1.6.jar
/usr/share/aws/emr/emrfs/lib/commons-codec-1.9.jar
/usr/share/aws/emr/hadoop-state-pusher/lib/commons-codec-1.8.jar
/usr/lib/hbase/lib/commons-codec-1.7.jar
/usr/lib/mahout/lib/commons-codec-1.7.jar
```
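All of the versions still present are 1.6 or newer, which already include BaseNCodec, so the copy triggering the error is presumably bundled somewhere this name pattern does not match. To confirm what any individual jar ships, javap can print the class declaration (sketch; the path is one from the listing above):

```
# From commons-codec 1.5 onward the declaration should read
# "public class ...Base64 extends org.apache.commons.codec.binary.BaseNCodec"
javap -classpath /usr/share/aws/emr/emrfs/lib/commons-codec-1.9.jar \
      org.apache.commons.codec.binary.Base64 | head -n 2
```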

Release label 4.1.0 gives the same error. What changed between AMI 3.7.0 and the 4.x releases to cause this, and how do I fix it?

Solution

Finally, I added the following logic to a bootstrap action:

```
# Fetch the commons-codec version our job was built against
wget 'http://central.maven.org/maven2/commons-codec/commons-codec/1.5/commons-codec-1.5.jar'
sudo mkdir -p /usr/lib/hadoop/lib
# Drop it into Hadoop's lib directory under a distinctive name so it is
# picked up ahead of the versions bundled elsewhere on the cluster
sudo cp commons-codec-1.5.jar /usr/lib/hadoop/lib/remedial-commons-codec-1.5.jar
rm commons-codec-1.5.jar
```

This downloads the correct version of the jar from Maven Central and places it at the front of the classpath of the failing job step, where it takes precedence over the other versions of the jar on the cluster.
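For reference, a bootstrap action runs on every node before Hadoop starts, so the script above can be attached at cluster creation time. A minimal sketch of wiring it up with the AWS CLI, assuming the script has been uploaded to a hypothetical s3://my-bucket/fix-commons-codec.sh:

```
aws emr create-cluster \
    --release-label emr-4.2.0 \
    --applications Name=Hadoop \
    --instance-type m3.xlarge \
    --instance-count 3 \
    --use-default-roles \
    --bootstrap-actions Path=s3://my-bucket/fix-commons-codec.sh,Name=fix-commons-codec
```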

Is there a cleaner solution?
