Import error: No module named impyla
I’ve installed impyla and its dependencies following this guide. The installation seems to have been successful, because I can now see the folder “impyla-0.13.8-py2.7.egg” in the Anaconda folder (64-bit Anaconda version 4.1.1).
But when I import impyla in python, I get the following error:
>>> import impyla
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: No module named impyla
I have 64-bit Python 2.7.12 installed.
Can anyone explain why I’m getting this error? I’m new to Python and have been spending a lot of time on different blogs but I haven’t seen much information yet. Thank you in advance for your time.
Solution
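The package is called impyla, but the module it installs is named impala, so "import impyla" fails while "import impala" works. The usage is a bit different from what you tried (from https://github.com/cloudera/impyla).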
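If it is unclear which name your interpreter can actually see, a quick check like the one below may help. It is only a diagnostic sketch: can_import is a hypothetical helper written for this answer, not part of impyla, and printing sys.executable is there because an egg installed under Anaconda will not be visible to a separately installed Python.

```python
import importlib
import sys


def can_import(name):
    """Return True if the running interpreter can import the named module."""
    try:
        importlib.import_module(name)
        return True
    except ImportError:
        return False


# Shows which Python is running (Anaconda's or a separate install).
print(sys.executable)

# "impala" is the module name used in the README; "impyla" is the package name.
for name in ("impala", "impyla"):
    print("%s importable: %s" % (name, can_import(name)))
```

If "impala" is reported as not importable, the interpreter printed on the first line is probably not the one the package was installed into.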
Impyla implements the Python DB API v2.0 (PEP 249) database interface (see the PEP for API details):
from impala.dbapi import connect
conn = connect(host='my.host.com', port=21050)
cursor = conn.cursor()
cursor.execute('SELECT * FROM mytable LIMIT 100')
print(cursor.description)  # prints the result set's schema
results = cursor.fetchall()
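Under PEP 249, cursor.description is a sequence with one entry per column, and the first item of each entry is the column name. The sketch below pairs those names with the fetched rows. It uses the standard library's sqlite3 module as a stand-in, on the assumption that you may not have an Impala host to test against; the table and its contents are made up for illustration.

```python
import sqlite3

# sqlite3 stands in for impala.dbapi here; both follow PEP 249.
conn = sqlite3.connect(":memory:")
cursor = conn.cursor()
cursor.execute("CREATE TABLE mytable (id INTEGER, name TEXT)")
cursor.executemany("INSERT INTO mytable VALUES (?, ?)", [(1, "a"), (2, "b")])

cursor.execute("SELECT id, name FROM mytable ORDER BY id LIMIT 100")

# The first item of each description entry is the column name.
column_names = [col[0] for col in cursor.description]

# Turn each row tuple into a dict keyed by column name.
records = [dict(zip(column_names, row)) for row in cursor.fetchall()]
print(records)
```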
The Cursor object also exposes an iterator interface, which is buffered (controlled by cursor.arraysize):
cursor.execute('SELECT * FROM mytable LIMIT 100')
for row in cursor:
    process(row)  # process() stands in for your own row-handling code
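To give a feel for what buffered iteration means, here is a rough sketch that pulls rows in batches with the PEP 249 fetchmany call and yields them one at a time. This is not impyla's internal implementation; iter_rows is a hypothetical helper, and sqlite3 is again used as a stand-in so the example runs without a cluster.

```python
import sqlite3


def iter_rows(cursor, batch_size=None):
    """Yield rows one at a time, fetching them from the cursor in batches."""
    if batch_size is None:
        batch_size = cursor.arraysize  # PEP 249 default batch size
    while True:
        batch = cursor.fetchmany(batch_size)
        if not batch:
            break
        for row in batch:
            yield row


conn = sqlite3.connect(":memory:")
cursor = conn.cursor()
cursor.execute("CREATE TABLE mytable (n INTEGER)")
cursor.executemany("INSERT INTO mytable VALUES (?)", [(i,) for i in range(10)])

cursor.execute("SELECT n FROM mytable ORDER BY n LIMIT 100")
rows = list(iter_rows(cursor, batch_size=4))
print(rows)
```

A larger batch size means fewer fetch calls at the cost of holding more rows in memory at once.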
You can also get the results back as a pandas DataFrame:
from impala.util import as_pandas
df = as_pandas(cursor)
# carry df through scikit-learn, for example