Java – Execute ValueFilter and count values in the HBase shell

Here is a solution to the problem.

I’m using the HBase shell and wondering whether I can count the values matched by the following scan command:

scan 'table', { COLUMNS => 'cf:c', FILTER => "ValueFilter( =, 'substring:myvalue' )" }

It should show the count on the shell. Any ideas?

Thanks for your help.

Solution

The count command does not support filters; only scan does.

AFAIK, counting with filters directly in the HBase shell is not possible.

For small data, you can do something like the following with the HBase Java client:

// Build the scan with your value filter (the same one as in the shell command)
Scan scan = new Scan();
scan.setFilter(new ValueFilter(CompareFilter.CompareOp.EQUAL,
    new SubstringComparator("myvalue")));
ResultScanner scanner = table.getScanner(scan);  // 'table' is an open Table handle
long count = 0;
for (Result rs = scanner.next(); rs != null; rs = scanner.next()) {
    count++;
}
scanner.close();
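
For reference, a more complete, self-contained sketch of that client-side count might look like the following (assuming the HBase 1.x client API and the table, column family, and substring value from the question; adjust the names to your setup):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.*;
import org.apache.hadoop.hbase.filter.CompareFilter;
import org.apache.hadoop.hbase.filter.SubstringComparator;
import org.apache.hadoop.hbase.filter.ValueFilter;
import org.apache.hadoop.hbase.util.Bytes;

public class FilteredValueCounter {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    try (Connection connection = ConnectionFactory.createConnection(conf);
         Table table = connection.getTable(TableName.valueOf("table"))) {
      // Same filter as the shell command: keep cells whose value contains "myvalue"
      Scan scan = new Scan();
      scan.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("c"));
      scan.setFilter(new ValueFilter(CompareFilter.CompareOp.EQUAL,
          new SubstringComparator("myvalue")));
      long count = 0;
      try (ResultScanner scanner = table.getScanner(scan)) {
        for (Result rs : scanner) {
          count++;  // one per matching row; use rs.size() to count matching cells instead
        }
      }
      System.out.println("Matching rows: " + count);
    }
  }
}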

For big data (for speed and parallelism, you need MapReduce or some other distributed framework):

I recommend using a MapReduce program to count the number of rows. In the driver, set your value filter on the Scan object, as shown in the following example.

import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.filter.*;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.hbase.mapreduce.TableMapper;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.output.NullOutputFormat;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class SimpleRowCounter extends Configured implements Tool {

  static class RowCounterMapper extends TableMapper<ImmutableBytesWritable, Result> {
    public static enum Counters { ROWS }

    @Override
    public void map(ImmutableBytesWritable row, Result value, Context context) {
      context.getCounter(Counters.ROWS).increment(1);
    }
  }

  @Override
  public int run(String[] args) throws Exception {
    if (args.length != 1) {
      System.err.println("Usage: SimpleRowCounter <tablename>");
      return -1;
    }
    String tableName = args[0];
    Scan scan = new Scan();

    // Set your value filter on the scan. To mirror the shell filter from the question,
    // use new ValueFilter(CompareFilter.CompareOp.EQUAL, new SubstringComparator("myvalue")).
    Filter valFilter = new ValueFilter(CompareFilter.CompareOp.GREATER_OR_EQUAL,
        new BinaryComparator(Bytes.toBytes("1500")));
    scan.setFilter(valFilter);

    Job job = new Job(getConf(), getClass().getSimpleName());
    job.setJarByClass(getClass());
    TableMapReduceUtil.initTableMapperJob(tableName, scan,
        RowCounterMapper.class, ImmutableBytesWritable.class, Result.class, job);
    job.setNumReduceTasks(0);
    job.setOutputFormatClass(NullOutputFormat.class);
    return job.waitForCompletion(true) ? 0 : 1;
  }

  public static void main(String[] args) throws Exception {
    int exitCode = ToolRunner.run(HBaseConfiguration.create(),
        new SimpleRowCounter(), args);
    System.exit(exitCode);
  }
}
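
Once the job completes, the filtered row count is available from the job counters. As a small sketch (an assumption about how you might adapt the end of run(), not part of the original answer), you could print it before returning:

    boolean success = job.waitForCompletion(true);
    long rows = job.getCounters()
        .findCounter(RowCounterMapper.Counters.ROWS).getValue();
    System.out.println("Filtered row count: " + rows);
    return success ? 0 : 1;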
