Spark forces log4j
I have a simple spark project in Scala and want to use logback, but spark/hadoop seems to force me to use log4j.
This seems inconsistent with my understanding of the purpose of slf4j; be
Isn’t this an oversight of Spark/Hadoop?Do I have to abandon logback and use log4j?
The workaround?
In build.sbt I tried excluding ….
"org.apache.spark" %% "spark-core" % "1.4.1" excludeAll(
ExclusionRule(name = "log4j"),
ExclusionRule(name = "slf4j-log4j12")
),
"org.slf4j" % "slf4j-api" % "1.7.12",
"ch.qos.logback" % "logback-core" % "1.1.3",
"ch.qos.logback" % "logback-classic" % "1.1.3"
… But this causes an exception….
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/log4j/Level
at org.apache.hadoop.mapred.JobConf.<clinit>(JobConf.java:354)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:344)
at org.apache.hadoop.conf.Configuration.getClassByNameOrNull(Configuration.java:1659)
at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:91)
at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75)
at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
at org.apache.hadoop.security.Groups.<init>(Groups.java:55)
at org.apache.hadoop.security.Groups.getUserToGroupsMappingService(Groups.java:182)
at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:235)
at org.apache.hadoop.security.UserGroupInformation.ensureInitialized(UserGroupInformation.java:214)
at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:669)
at org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:571)
at org.apache.spark.util.Utils$$anonfun$getCurrentUserName$1.apply(Utils.scala:2162)
at org.apache.spark.util.Utils$$anonfun$getCurrentUserName$1.apply(Utils.scala:2162)
at scala. Option.getOrElse(Option.scala:120)
at org.apache.spark.util.Utils$.getCurrentUserName(Utils.scala:2162)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:301)
at spike. HelloSpark$.main(HelloSpark.scala:19)
at spike. HelloSpark.main(HelloSpark.scala)
Caused by: java.lang.ClassNotFoundException: org.apache.log4j.Level
at java.net.URLClassLoader$1.run(URLClassLoader.java:372)
at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:360)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 20 more
Solution
I ran into the same exception as you. I think you should add log4j-over-slf4j as a dependency in addition to excluding log4j and slf4j-log4j12. It worked for me.
log4j-over-slf4j is an alternative to log4j because it provides exactly the same API as log4j and actually routes all calls to log4j to slf4j, which in turn routes everything to the underlying logging framework. https://www.slf4j.org/legacy.html gives a detailed explanation.