Python – Pass the Python script used for mapreduce to HBase

We have an HBase deployment running on top of Hadoop. So far, all of our Map/Reduce jobs have been written as Java classes. I was wondering if there is a good way to run a Map/Reduce job against HBase using a Python script.

Solution

There is a good open-source library that can be used for this purpose: HappyBase.

Here is an example of some simple HBase operations done with HappyBase:

import happybase

# Connect to the HBase Thrift gateway and open a table
connection = happybase.Connection('localhost')
table = connection.table('my-table')

# Store one row with two columns
table.put(b'row-key', {b'family:qual1': b'value1',
                       b'family:qual2': b'value2'})

# Fetch a single row and read one column
row = table.row(b'row-key')
print(row[b'family:qual1'])  # prints b'value1'

# Fetch several rows at once
for key, data in table.rows([b'row-key-1', b'row-key-2']):
    print(key, data)  # prints the row key and its columns for each row

# Scan all rows whose key starts with a prefix
for key, data in table.scan(row_prefix=b'row'):
    print(key, data)  # prints the row key and its columns for each row

# Delete a row
table.delete(b'row-key')

So, if you want to run a Map/Reduce job in Python to access HBase, here’s what you can do:

  1. Install HappyBase on all data nodes.
  2. Write your mapper (and reducer, if needed) as Python scripts that use HappyBase, and run the job on the cluster with Hadoop Streaming; a minimal mapper sketch is shown below, and the details of running Python streaming jobs are covered in the Hadoop Streaming documentation.
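
As an illustration, here is a minimal sketch of such a streaming mapper. It assumes that an HBase Thrift gateway is reachable from the task nodes (HappyBase talks to HBase over Thrift), that a table named my-table with a column family named family already exists, and that each input line is tab-separated as row-key<TAB>value; the host, table, and column names are made up for illustration.

#!/usr/bin/env python
# mapper.py - hypothetical Hadoop Streaming mapper that loads its input into HBase.
import sys

import happybase

connection = happybase.Connection('localhost')  # Thrift gateway host (assumption)
table = connection.table('my-table')            # table name is an assumption

for line in sys.stdin:
    line = line.rstrip('\n')
    if not line:
        continue
    # Assume each input line looks like "row-key<TAB>value"
    row_key, _, value = line.partition('\t')
    table.put(row_key.encode(), {b'family:qual1': value.encode()})
    # Emit a key/value pair so Hadoop Streaming has mapper output to collect
    print('%s\tstored' % row_key)

connection.close()

You would then submit the script with the Hadoop Streaming jar (for example, hadoop jar hadoop-streaming.jar -input <in> -output <out> -mapper mapper.py -file mapper.py; the exact jar path depends on your distribution). For heavier write loads, HappyBase's table.batch() can group puts instead of sending one row at a time.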
