Java – Apache Spark: NullPointerException on broadcast variables (YARN cluster mode)

Here is a solution to the problem.


I have a simple Spark application where I'm trying to broadcast a variable on a YARN cluster.
But every time I try to access the broadcast variable's value inside a task, I get a NullPointerException. It would be very helpful if you could suggest what I'm doing wrong here.
My code is as follows:

public class TestApp implements Serializable {
    static Broadcast<String[]> mongoConnectionString;

    public static void main(String[] args) {
        String mongoBaseURL = args[0];
        SparkConf sparkConf = new SparkConf().setAppName(Constants.appName);
        JavaSparkContext javaSparkContext = new JavaSparkContext(sparkConf);

        mongoConnectionString = javaSparkContext.broadcast(args);

        JavaSQLContext javaSQLContext = new JavaSQLContext(javaSparkContext);

        JavaSchemaRDD javaSchemaRDD = javaSQLContext.jsonFile(hdfsBaseURL + Constants.hdfsInputDirectoryPath);

        if (javaSchemaRDD != null) {
            JavaSchemaRDD pageSchemaRDD = javaSQLContext.sql(SqlConstants.getLogActionPage);
            pageSchemaRDD.foreach(new Test());
        }
    }

    private static class Test implements VoidFunction<Row> {
        private static final long serialVersionUID = 1L;

        public void call(Row t) throws Exception {
            // Throws NullPointerException on the executors: the static
            // field was never assigned in the executor JVM.
            System.out.println("mongoConnectionString " + mongoConnectionString.value());
        }
    }
}


This is because your broadcast variable is stored in a class-level (static) field. Static fields are not serialized with the task, so when the Test class is loaded on a worker JVM it does not see the value you assigned in the main method; it only sees null, because the field was never initialized there, and calling .value() on it throws the NullPointerException. The solution I found was to pass the broadcast variable to the function when it is created, so it travels as an instance field. The same goes for accumulators.
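The mechanism behind the answer can be demonstrated without Spark at all: Java serialization copies an object's instance fields but never its static fields, which is exactly why the executor-side copy of Test sees null. A minimal, Spark-free sketch (the class and field names here are made up for illustration):

```java
import java.io.*;

// Mimics the question's function class: one static field (like the
// static Broadcast) and one instance field (the fix: pass it in).
class Func implements Serializable {
    private static final long serialVersionUID = 1L;
    static String broadcastLike;   // set on the "driver", never serialized
    final String passedIn;         // serialized with the object

    Func(String v) { this.passedIn = v; }
}

public class Demo {
    public static void main(String[] args) throws Exception {
        Func.broadcastLike = "set-on-driver";
        Func f = new Func("set-on-driver");

        // Serialize the function object, as Spark does when shipping a task.
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        ObjectOutputStream oos = new ObjectOutputStream(bos);
        oos.writeObject(f);
        oos.close();

        // Simulate a fresh executor JVM, where the static was never assigned.
        Func.broadcastLike = null;
        Func g = (Func) new ObjectInputStream(
                new ByteArrayInputStream(bos.toByteArray())).readObject();

        System.out.println("static field:   " + Func.broadcastLike);  // null
        System.out.println("instance field: " + g.passedIn);          // set-on-driver
    }
}
```

Applied to the question's code, this means giving Test a constructor that takes the Broadcast<String[]>, storing it in an instance field, reading it inside call(), and invoking pageSchemaRDD.foreach(new Test(mongoConnectionString)) on the driver.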
