Java – How do I get the Hive JDBC driver and Hive shell to communicate with the same database?


I have a Hive server running on the default port 10000, started with: hive --service hiveserver
Then I have a Java program (from the Hive JDBC client tutorial) that connects to it using:

Connection con = DriverManager.getConnection("jdbc:hive://localhost:10000/default", "", "");

The tutorial runs, creates a table testhivedrivertable in the default database, and describes it. This works fine, and my Hive service logs a bunch of output.

Then I tried opening a shell against the same database via hive -p 10000. That got me a shell, but I couldn't see the tables created by the Java program (nor could the Java program see the tables I created from the shell). Also, when I run a command in the Hive shell, nothing shows up in the server console, so I'm pretty sure I'm talking to a different Hive instance.

How can I get the Hive shell to interact with the same database as the Java JDBC driver?

Solution

You are talking to the same Hive instance; unfortunately, you are not talking to the same metastore.

The Hive metastore is a database that stores metadata about Hive tables (for example, table name, column names and types, table location, storage handler in use, number of buckets in the table, sort columns (if any), partition columns (if any), etc.). When you create a table, the metastore is updated with information about the new table, and that information is read back when you issue a query against the table.

However, one of the important considerations for Hive's creators was to make it easy to use out of the box. This led them to default to an embedded Derby database as the metastore. It requires no setup, but the side effect is that the database is scoped to a single CLI invocation or a single JDBC client context. As a result, Hive metadata is not shared across multiple invocations or across clients. This is what you are seeing.

You should migrate to a standalone metastore, which will hold metadata across multiple Hive clients. MySQL and PostgreSQL are popular choices. Cloudera has a good article about configuring Hive to use a MySQL metastore.
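As a rough sketch of what that looks like, a MySQL-backed metastore is typically configured through the javax.jdo.option.* properties in hive-site.xml. The host, port, database name (metastore), user name (hiveuser) and password below are placeholders I made up for illustration, not values from the question; adjust them to your setup.

```xml
<!-- hive-site.xml: minimal sketch of a MySQL-backed metastore.
     Host, database name, user and password are hypothetical placeholders. -->
<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://localhost:3306/metastore?createDatabaseIfNotExist=true</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>hiveuser</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>changeme</value>
  </property>
</configuration>
```

The MySQL JDBC connector JAR also has to be on Hive's classpath (usually by placing it in Hive's lib directory), and both the Hive server and the Hive shell need to pick up the same hive-site.xml so that they point at the same metastore.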
