Pass the Python script used for mapreduce to HBase
We have a Hadoop-based HBase implementation. So far, all of our Map-Reduce jobs have been written as Java classes. I was wondering whether there is a good way to run Map-Reduce jobs against HBase using Python scripts.
Solution
There is a good open-source library that can be used for this purpose. It's called HappyBase.
Here is an example of some simple HBase operations done with HappyBase:
import happybase

# HappyBase talks to HBase through the HBase Thrift server.
connection = happybase.Connection('localhost')
table = connection.table('my-table')

# Python 3 syntax; HappyBase 1.x expects bytes for keys, columns and values.
table.put(b'row-key', {b'family:qual1': b'value1', b'family:qual2': b'value2'})

row = table.row(b'row-key')
print(row[b'family:qual1'])  # prints b'value1'

for key, data in table.rows([b'row-key-1', b'row-key-2']):
    print(key, data)  # prints row key and data for each row

for key, data in table.scan(row_prefix=b'row'):
    print(key, data)  # prints every row whose key starts with 'row'

table.delete(b'row-key')  # delete() returns nothing, so the result is not assigned
So, if you want to run a Map/Reduce job in Python to access HBase, here’s what you can do:
- Install HappyBase on all data nodes.
- Run your job on the cluster using Hadoop Streaming with Python scripts as the mapper and reducer, as detailed in the streaming section.
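The two steps above could be sketched roughly as the streaming mapper below. This is only an illustration under stated assumptions, not a definitive implementation: the input format (one 'row_key<TAB>value' record per line), the helper names parse_line, run_mapper and main, and the table and column names are all hypothetical. Only happybase.Connection, Connection.table, Connection.close, Table.batch and Batch.put are actual HappyBase calls.

```python
import sys


def parse_line(line):
    """Split one 'row_key<TAB>value' input line; return None if malformed."""
    parts = line.rstrip("\n").split("\t", 1)
    if len(parts) != 2 or not parts[0]:
        return None
    return parts[0], parts[1]


def run_mapper(table, lines, out=sys.stdout):
    """Store each record in HBase and emit 'row_key<TAB>1' for a reducer."""
    written = 0
    # Batch the puts so each mutation does not cost its own Thrift round trip.
    with table.batch(batch_size=1000) as batch:
        for line in lines:
            record = parse_line(line)
            if record is None:
                continue  # skip malformed input rather than failing the task
            row_key, value = record
            # HappyBase 1.x on Python 3 expects bytes for keys, columns, values.
            # 'family:qual1' is an assumed column name, reused from the example.
            batch.put(row_key.encode("utf-8"),
                      {b"family:qual1": value.encode("utf-8")})
            out.write("%s\t1\n" % row_key)
            written += 1
    return written


def main(host="localhost", table_name="my-table"):
    # Imported here so the parsing logic can be exercised without HappyBase.
    import happybase

    connection = happybase.Connection(host)
    try:
        run_mapper(connection.table(table_name), sys.stdin)
    finally:
        connection.close()
```

To use this as a mapper, save it as a script (for example mapper.py), call main() at the bottom of the file, and pass the script to Hadoop Streaming with its -files, -mapper, -input and -output options. The host given to main() must be a machine running the HBase Thrift server that every task node can reach.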