HBase >> mail # user >> Find rows which do not have any of the given columns


Shrijeet Paliwal 2012-08-06, 06:42
jmozah 2012-08-06, 15:48
Shrijeet Paliwal 2012-08-06, 16:04
jmozah 2012-08-06, 16:25
Re: Find rows which do not have any of the given columns
It seems setting the time range is the problem. I was doing:

  scan.setTimeRange(Long.valueOf(args[4]), Long.valueOf(args[5]));

I was working on the assumption that filter logic runs before scan logic; in
other words, a KV dropped by the filter will not make it to the scan. In the
case of a time range this might not be true.
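
To make the suspicion concrete, here is a minimal sketch in plain Java. It is
a toy model, not HBase code, and it assumes (this thread does not confirm it)
that cells outside the time range are discarded before the filter ever sees
them. The names TimeRangeSkipModel, Cell, scan, and sampleCells are
hypothetical and exist only for illustration. Under that assumption, a row
whose only "forbidden" cell lies outside the time range is never skipped:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy model only: plain Java, no HBase classes involved.
class TimeRangeSkipModel {

    // A cell reduced to the three fields that matter for this question.
    static final class Cell {
        final String row;
        final String qualifier;
        final long timestamp;

        Cell(String row, String qualifier, long timestamp) {
            this.row = row;
            this.qualifier = qualifier;
            this.timestamp = timestamp;
        }
    }

    // Returns the row keys the modelled scan emits, ASSUMING cells outside
    // [minTs, maxTs) are dropped before the skip-style filter runs.
    // Pass Long.MIN_VALUE and Long.MAX_VALUE to model "no time range".
    static List<String> scan(List<Cell> cells, Set<String> forbidden,
                             long minTs, long maxTs) {
        // row -> "no forbidden qualifier seen so far"; keeps first-seen order.
        Map<String, Boolean> eligible = new LinkedHashMap<>();
        for (Cell c : cells) {
            if (c.timestamp < minTs || c.timestamp >= maxTs) {
                continue; // time range check first: the filter never sees this cell
            }
            boolean passes = !forbidden.contains(c.qualifier);
            // One failing cell makes the whole row ineligible (skip semantics).
            eligible.merge(c.row, passes, Boolean::logicalAnd);
        }
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, Boolean> e : eligible.entrySet()) {
            if (e.getValue()) {
                out.add(e.getKey());
            }
        }
        return out;
    }

    // Made-up sample data: row2 has forbidden qualifier "b", but with an old timestamp.
    static List<Cell> sampleCells() {
        return Arrays.asList(
            new Cell("row1", "a", 100L), new Cell("row1", "x", 100L),
            new Cell("row2", "a", 100L), new Cell("row2", "b", 10L),
            new Cell("row3", "a", 100L), new Cell("row3", "c", 120L));
    }

    public static void main(String[] args) {
        Set<String> forbidden = new HashSet<>(Arrays.asList("b", "c"));
        // No time range: row2 and row3 are skipped.
        System.out.println(scan(sampleCells(), forbidden, Long.MIN_VALUE, Long.MAX_VALUE));
        // Time range [50, 200): the old "b" cell is invisible, so row2 slips through.
        System.out.println(scan(sampleCells(), forbidden, 50L, 200L));
    }
}
```

In this model the first call prints [row1] and the second prints
[row1, row2]. If the real scan behaves the same way, that would match what I
saw in production: the returned rows do contain one of the given qualifiers,
just with a timestamp outside the requested range.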

-Shrijeet
On Mon, Aug 6, 2012 at 9:25 AM, jmozah <[EMAIL PROTECTED]> wrote:

> Hmmm.. Missed it. Otherwise I don't spot anything wrong in this.
> Are you sure about the column names?
>
> ./zahoor
>
>
> On 06-Aug-2012, at 9:34 PM, Shrijeet Paliwal <[EMAIL PROTECTED]> wrote:
>
> > I am using FilterList. Could you elaborate?
> >
> > On Mon, Aug 6, 2012 at 8:48 AM, jmozah <[EMAIL PROTECTED]> wrote:
> >
> >>
> >>
> >> Use FilterList instead of List of Filters.
> >>
> >> ./Zahoor
> >>
> >> On 06-Aug-2012, at 12:12 PM, Shrijeet Paliwal <[EMAIL PROTECTED]> wrote:
> >>
> >>> Hi All,
> >>>
> >>> I am writing a job which finds rows that do not have a cell corresponding
> >>> to any of the columns in the given set of columns.
> >>> This is how I have configured my scan (a combination of QualifierFilters
> >>> and SkipFilter):
> >>>
> >>>   // columns is a CSV string containing column names
> >>>   columnsSet = Splitter.on(',').split(columns);
> >>>   List<Filter> qualifierFilters = new ArrayList<Filter>();
> >>>   for (String qual : columnsSet) {
> >>>     qualifierFilters.add(new QualifierFilter(CompareOp.NOT_EQUAL,
> >>>         new BinaryComparator(Bytes.toBytes(qual))));
> >>>   }
> >>>   Filter skipFilter = new SkipFilter(
> >>>       new FilterList(Operator.MUST_PASS_ALL, qualifierFilters));
> >>>   Scan scan = new Scan();
> >>>   scan.addFamily(Bytes.toBytes(family));
> >>>   scan.setCacheBlocks(false);
> >>>   scan.setCaching(1000);
> >>>   scan.setFilter(skipFilter);
> >>>   scan.setTimeRange(Long.valueOf(args[4]), Long.valueOf(args[5]));
> >>>
> >>> In my test table the scan worked as expected. But in the production run, I
> >>> got rows which had cells containing one of the given qualifiers (not
> >>> expected). Can someone help me spot the mistake?
> >>>
> >>> -Shrijeet
> >>
> >>
>
>