Python – Connect to the Hive remote server from Python

Connect to the Hive remote server from Python… here is a solution to the problem.

Connect to the Hive remote server from Python

I can’t connect to a hive running on a remote server through my python script.
I’m using the same script (with different server details, of course) to connect to a hive running on my localhost and was able to connect.

I’m using the command to start the server from the command line on localhost:

hive —service hiveserver2

Start the server and run the Python script

Connect to the script for Hive running on localhost:

import pyhs2    
conn = pyhs2.connect(host='localhost', port=10000, authMechanism='PLAIN', user='hive', password ='', database='default')
    with conn.cursor() as cur:
            cur.execute("show databases")
            for i in cur.fetch():
                print i  

Using the code above, I was able to access the db@Hive on localhost.

I’m connecting to a remote server using the code below, here I’m not doing anything on the command line to start the remote server.

Connect to a script for Hive running on a remote server:

conn = pyhs2.connect(host='<my remote server Ip>', port=<port no>, authMechanism='PLAIN', user='<usernameToConnectToRemoteServer>', password ="< remoteServerPassword>"  database='default')
with conn.cursor() as cur:
        cur.execute("show databases")
        for i in cur.fetch():
            print i

This returns a message:

thrift.transport.TTransport.TTransportException: TSocket read 0 bytes.

I’ve tried to google and find a solution as best I can, but all I’m seeing is an example of connecting to localhost. Please help me with this.

Solution

Try SSH to your remote machine and then connect to hive- like below

import paramiko
import traceback

def hive_query_executor():
dns_name = ''
conn_obj = paramiko. SSHClient()
conn_obj.set_missing_host_key_policy(paramiko. AutoAddPolicy())
try:
    conn_obj.connect(dns_name, username="hadoop",
                     key_filename='')# or password

Hive_query="select * from abc limit 10; "
    query_execute_command = 'ihive -e "' + impala_query + '"'
    std_in, std_out, std_err = conn_obj.exec_command(query_execute_command)

conn_obj.close()

except:
    print "Error :" + str(traceback.format_exc())
    exit(0)

hive_query_executor()

Related Problems and Solutions