Java – Spark SQLContext could not find the Hive table

Spark SQLContext could not find the Hive table… here is a solution to the problem.

I tried to query a Hive table from a simple Spark job written in Java:

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.sql.DataFrame;
import org.apache.spark.sql.SQLContext;

SparkConf conf = new SparkConf().setMaster("local[*]").setAppName("MyJob");
JavaSparkContext sc = new JavaSparkContext(conf);
SQLContext sqlContext = new SQLContext(sc);

DataFrame df = sqlContext.table("scf");

But when I submit the jar via spark-submit, I get the following error:

Exception in thread "main" org.apache.spark.sql.catalyst.analysis.NoSuchTableException
    at org.apache.spark.sql.catalyst.analysis.SimpleCatalog.lookupRelation(Catalog.scala:108)
    at org.apache.spark.sql.SQLContext.table(SQLContext.scala:831)
    at org.apache.spark.sql.SQLContext.table(SQLContext.scala:827)
    at MyJob.myJob(MyJob.java:30)
    at MyJob.main(MyJob.java:65)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:729)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

I’m sure the table exists. If I run sqlContext.table("scf").count in spark-shell, it returns the expected count.

What could be the problem?

Thanks!

Solution

SQLContext does not support Hive; you must use HiveContext (Spark 1.x) or a SparkSession with Hive support enabled (Spark 2.x). The SimpleCatalog in the stack trace confirms that a plain SQLContext was used: it only resolves tables registered in the current session and never consults the Hive metastore. The query works in spark-shell because, in a Spark build with Hive support, the shell's pre-created sqlContext is actually a HiveContext. In Scala:

import org.apache.spark.sql.hive.HiveContext

val sqlContext = new HiveContext(sc)
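
Since the question's job is written in Java, here is the same fix as a minimal Java sketch for Spark 1.x. It assumes the spark-hive dependency is on the classpath and reuses the JavaSparkContext sc and table name scf from the question.

import org.apache.spark.sql.DataFrame;
import org.apache.spark.sql.hive.HiveContext;

// HiveContext extends SQLContext and resolves table names
// against the Hive metastore instead of an in-memory catalog.
HiveContext hiveContext = new HiveContext(sc.sc());

DataFrame df = hiveContext.table("scf");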

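On Spark 2.x, HiveContext is deprecated in favor of SparkSession. A sketch of the equivalent setup, reusing the master and app name from the question:

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

SparkSession spark = SparkSession.builder()
        .master("local[*]")
        .appName("MyJob")
        .enableHiveSupport()   // requires the Hive classes on the classpath
        .getOrCreate();

Dataset<Row> df = spark.table("scf");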