HBase >> mail # user >> Find rows which do not have any of the given columns


Find rows which do not have any of the given columns
Hi All,

I am writing a job that finds rows which do not have a cell for any of the
columns in a given set of columns.
This is how I have configured my scan (a combination of QualifierFilters
and a SkipFilter):

    // columns is a CSV string containing column names
    columnsSet = Splitter.on(',').split(columns);
    List<Filter> qualifierFilters = new ArrayList<Filter>();
    for (String qual : columnsSet) {
      qualifierFilters.add(new QualifierFilter(CompareOp.NOT_EQUAL,
          new BinaryComparator(Bytes.toBytes(qual))));
    }
    Filter skipFilter = new SkipFilter(
        new FilterList(Operator.MUST_PASS_ALL, qualifierFilters));
    Scan scan = new Scan();
    scan.addFamily(Bytes.toBytes(family));
    scan.setCacheBlocks(false);
    scan.setCaching(1000);
    scan.setFilter(skipFilter);
    scan.setTimeRange(Long.valueOf(args[4]), Long.valueOf(args[5]));
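
To make the intended result concrete, here is a minimal plain-Java sketch of the predicate the scan is meant to implement: a row is kept only if none of its qualifiers appear in the given set. It does not use HBase at all and ignores the time range; the class name, method name, and sample rows are hypothetical and exist only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class MissingColumnsSketch {

    // Returns the keys of rows whose qualifier set shares nothing with
    // the unwanted qualifiers. "rows" maps row key -> qualifiers present
    // in that row (an in-memory stand-in for a table, not an HBase API).
    static List<String> rowsMissingAll(Map<String, Set<String>> rows,
                                       Set<String> unwanted) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Set<String>> row : rows.entrySet()) {
            // disjoint == true means the row has none of the given columns
            if (Collections.disjoint(row.getValue(), unwanted)) {
                result.add(row.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> rows = new LinkedHashMap<>();
        rows.put("row1", new HashSet<>(Arrays.asList("a", "b")));
        rows.put("row2", new HashSet<>(Arrays.asList("c")));
        rows.put("row3", new HashSet<>(Arrays.asList("b", "d")));

        Set<String> unwanted = new HashSet<>(Arrays.asList("a", "d"));

        // Only row2 has neither "a" nor "d".
        System.out.println(rowsMissingAll(rows, unwanted)); // prints [row2]
    }
}
```

Comparing the output of a model like this against the scan results on a small sample may help narrow down which rows are returned unexpectedly.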

On my test table the scan worked as expected. But in the production run I got
rows that had cells containing one of the given qualifiers (not expected).
Can someone help me spot the mistake?

-Shrijeet