A JAVA_HOME error occurs when upgrading to Spark 1.3.0
I’m trying to upgrade a Spark project written in Scala from Spark 1.2.1 to 1.3.0, so I changed my build.sbt
as follows:
-libraryDependencies += "org.apache.spark" %% "spark-core" % "1.2.1" % "provided"
+libraryDependencies += "org.apache.spark" %% "spark-core" % "1.3.0" % "provided"
Then I build an assembly jar and submit it:
HADOOP_CONF_DIR=/etc/hadoop/conf \
spark-submit \
--driver-class-path=/etc/hbase/conf \
--conf spark.hadoop.validateOutputSpecs=false \
--conf spark.yarn.jar=hdfs:/apps/local/spark-assembly-1.3.0-hadoop2.4.0.jar \
--conf spark.serializer=org.apache.spark.serializer.KryoSerializer \
--deploy-mode=cluster \
--master=yarn \
--class=TestObject \
--num-executors=54 \
target/scala-2.11/myapp-assembly-1.2.jar
Job submission fails, and the terminal receives the following exception:
15/03/19 10:20:03 INFO yarn.Client:
client token: N/A
diagnostics: Application application_1420225286501_4698 failed 2 times due to AM
Container for appattempt_1420225286501_4698_000002 exited with exitCode: 127
due to: Exception from container-launch:
org.apache.hadoop.util.Shell$ExitCodeException:
at org.apache.hadoop.util.Shell.runCommand(Shell.java:464)
at org.apache.hadoop.util.Shell.run(Shell.java:379)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589)
at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:283)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:79)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
Finally, I checked the web interface of the YARN application master (since the job shows up there, I know it gets at least that far), and the only logs it shows are these:
Log Type: stderr
Log Length: 61
/bin/bash: {{JAVA_HOME}}/bin/java: No such file or directory
Log Type: stdout
Log Length: 0
I’m not sure how to interpret this. Is {{JAVA_HOME}} a literal string (braces included) that somehow ended up in a launch script? Did this come from a worker node or from the driver? What can I do to experiment and troubleshoot?
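To dig further, you could pull the aggregated container logs for the failed application and look at the generated launch script, which shows the exact command the container tried to run. A minimal sketch, assuming log aggregation is enabled and using the application ID from the diagnostics above (the NodeManager log path is an assumption; check `yarn.nodemanager.local-dirs` in yarn-site.xml for the actual location):

```shell
# Fetch all container logs for the failed application,
# run as the user that submitted the job.
yarn logs -applicationId application_1420225286501_4698

# On the NodeManager host that ran the failed container, inspect the
# generated launch script to see the literal command that produced the
# "No such file or directory" error (path is an assumption):
find /var/log/hadoop-yarn -name 'launch_container.sh' \
  -path '*application_1420225286501_4698*' -exec cat {} \;
```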
I did set JAVA_HOME in the Hadoop config files on all nodes in the cluster:
% grep JAVA_HOME /etc/hadoop/conf/*.sh
/etc/hadoop/conf/hadoop-env.sh:export JAVA_HOME=/usr/jdk64/jdk1.6.0_31
/etc/hadoop/conf/yarn-env.sh:export JAVA_HOME=/usr/jdk64/jdk1.6.0_31
Has this behavior changed in 1.3.0 since 1.2.1? With 1.2.1 and without any other changes, the job completes normally.
[Note: I originally posted this on the Spark mailing list; if/when I find a solution, I’ll update both places.]
Solution
Have you tried setting JAVA_HOME in /etc/hadoop/conf/yarn-env.sh? Your JAVA_HOME environment variable might not be visible to the YARN container that runs the job.
I’ve seen cases where an environment variable set in .bashrc on a node is not picked up by the YARN worker spawned on that node.
The error may not be related to the version upgrade, but to the YARN environment configuration.
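One way to take the per-node environment out of the equation is to set JAVA_HOME explicitly at submit time: Spark on YARN supports per-application environment variables for the application master (`spark.yarn.appMasterEnv.*`) and the executors (`spark.executorEnv.*`). A sketch, reusing the JDK path from the grep output above:

```shell
# Propagate JAVA_HOME explicitly to the YARN AM and executor containers,
# so they don't depend on yarn-env.sh or .bashrc being read on each node.
# (JDK path taken from the hadoop-env.sh output shown in the question.)
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --conf spark.yarn.appMasterEnv.JAVA_HOME=/usr/jdk64/jdk1.6.0_31 \
  --conf spark.executorEnv.JAVA_HOME=/usr/jdk64/jdk1.6.0_31 \
  --class TestObject \
  target/scala-2.11/myapp-assembly-1.2.jar
```

If the job runs with these flags but fails without them, that confirms the containers were not inheriting JAVA_HOME from the node configuration.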