Java – Spark forces log4j


Spark forces log4j

I have a simple spark project in Scala and want to use logback, but spark/hadoop seems to force me to use log4j.

  1. This seems inconsistent with my understanding of the purpose of slf4j.
    Isn’t this an oversight in Spark/Hadoop?

  2. Do I have to abandon logback and use log4j?
    Or is there a workaround?

In build.sbt I tried excluding log4j and the slf4j-log4j12 binding…

"org.apache.spark" %% "spark-core" % "1.4.1" excludeAll(
    ExclusionRule(name = "log4j"),
    ExclusionRule(name = "slf4j-log4j12")
),
"org.slf4j" % "slf4j-api" % "1.7.12",
"ch.qos.logback" % "logback-core" % "1.1.3",
"ch.qos.logback" % "logback-classic" % "1.1.3"

…but this causes an exception:

Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/log4j/Level
    at org.apache.hadoop.mapred.JobConf.<clinit>(JobConf.java:354)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:344)
    at org.apache.hadoop.conf.Configuration.getClassByNameOrNull(Configuration.java:1659)
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:91)
    at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75)
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
    at org.apache.hadoop.security.Groups.<init>(Groups.java:55)
    at org.apache.hadoop.security.Groups.getUserToGroupsMappingService(Groups.java:182)
    at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:235)
    at org.apache.hadoop.security.UserGroupInformation.ensureInitialized(UserGroupInformation.java:214)
    at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:669)
    at org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:571)
    at org.apache.spark.util.Utils$$anonfun$getCurrentUserName$1.apply(Utils.scala:2162)
    at org.apache.spark.util.Utils$$anonfun$getCurrentUserName$1.apply(Utils.scala:2162)
    at scala.Option.getOrElse(Option.scala:120)
    at org.apache.spark.util.Utils$.getCurrentUserName(Utils.scala:2162)
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:301)
    at spike.HelloSpark$.main(HelloSpark.scala:19)
    at spike.HelloSpark.main(HelloSpark.scala)
Caused by: java.lang.ClassNotFoundException: org.apache.log4j.Level
    at java.net.URLClassLoader$1.run(URLClassLoader.java:372)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:360)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    ... 20 more

Solution

I ran into the same exception. The stack trace shows why the exclusions alone fail: Hadoop's JobConf references org.apache.log4j.Level directly, so removing log4j from the classpath leaves that class unresolved at runtime. Adding log4j-over-slf4j as a dependency, in addition to excluding log4j and slf4j-log4j12, fixed it for me.
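A minimal sketch of what the amended build.sbt might look like, reusing the coordinates from the snippet above (pinning the bridge to 1.7.12 is an assumption, chosen to match slf4j-api):

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "1.4.1" excludeAll(
    // Keep the transitive log4j jar and its slf4j binding off the classpath.
    ExclusionRule(name = "log4j"),
    ExclusionRule(name = "slf4j-log4j12")
  ),
  "org.slf4j" % "slf4j-api" % "1.7.12",
  // The bridge: re-implements the log4j 1.x API on top of slf4j, so Hadoop's
  // direct references (e.g. org.apache.log4j.Level) resolve at runtime.
  "org.slf4j" % "log4j-over-slf4j" % "1.7.12",
  "ch.qos.logback" % "logback-core" % "1.1.3",
  "ch.qos.logback" % "logback-classic" % "1.1.3"
)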

log4j-over-slf4j is a drop-in replacement for log4j: it provides the same API, but routes every log4j call to slf4j, which in turn delegates to the underlying logging framework (logback, in your case). https://www.slf4j.org/legacy.html gives a detailed explanation.
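As a quick way to convince yourself the bridge works, here is a hypothetical smoke test (the object and logger names are made up): code written against the log4j 1.x API ends up in logback once log4j-over-slf4j replaces log4j on the classpath.

import org.slf4j.LoggerFactory

object BridgeCheck {
  def main(args: Array[String]): Unit = {
    // Looks like log4j 1.x, but the class is supplied by log4j-over-slf4j,
    // so the call is routed to slf4j and from there to logback.
    val legacy = org.apache.log4j.Logger.getLogger("spike.BridgeCheck")
    legacy.info("routed through log4j-over-slf4j")

    // A native slf4j logger ends up in the same logback appenders.
    val direct = LoggerFactory.getLogger("spike.BridgeCheck")
    direct.info("logged via slf4j directly")
  }
}

One caveat from the slf4j documentation: log4j-over-slf4j and the real log4j jar must never be on the classpath at the same time, which is exactly what the excludeAll rules ensure.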
