-
Retrieving rows with specific values using SingleColumnValueFilter
Kumar, Suresh 2012-10-15, 23:00
I have an HBase table with some Apache logs loaded.
I am trying to retrieve a section of the logs to analyze using the following code. I would like all the rows between column values "DEBUG:xxxxx" and "yyyyy". How can I force the scan to return all these rows? I am using SingleColumnValueFilter and adding a FilterList that holds both filters, filter1 and filter2.
This code returns the exact row if I use filter1 ("DEBUG:xxxxx") or filter2 ("yyyyy") alone, but it does not return any rows if both are used together in the list. I would like all the rows between these two rows.
Am I missing something?
Thanks,
Suresh

    FilterList list = new FilterList(FilterList.Operator.MUST_PASS_ALL);

    RegexStringComparator comp1 = new RegexStringComparator("DEBUG:xxxxx.");
    SingleColumnValueFilter filter1 = new SingleColumnValueFilter(
            Bytes.toBytes("mylogs"), Bytes.toBytes("pcol"),
            CompareOp.EQUAL, comp1);
    //filter1.setFilterIfMissing(true);
    list.addFilter(filter1);

    SubstringComparator comp2 = new SubstringComparator("yyyyy");
    SingleColumnValueFilter filter2 = new SingleColumnValueFilter(
            Bytes.toBytes("mylogs"), Bytes.toBytes("pcol"),
            CompareOp.EQUAL, comp2);
    //filter1.setFilterIfMissing(true);
    list.addFilter(filter2);

    scan.setFilter(list);

    scanner = table.getScanner(scan);
    System.out.println("Results of scan:");
    for (Result result : scanner) {
        for (KeyValue kv : result.raw()) {
            System.out.print("ROW : " + new String(kv.getRow()) + " ");
            System.out.print("Family : " + new String(kv.getFamily()) + " ");
            System.out.print("Qualifier : " + new String(kv.getQualifier()) + " ");
            System.out.println("KV: " + kv + ", Value: "
                    + Bytes.toString(kv.getValue()));
        }
    }
    scanner.close();
-
Re: Retrieving rows with specific values using SingleColumnValueFilter
Norbert Burger 2012-10-16, 00:17
Try changing your CompareOp.EQUALs to CompareOp.GREATER_OR_EQUAL and CompareOp.LESS_OR_EQUAL, respectively. You want all rows between your two keys.
Norbert
-
RE: Retrieving rows with specific values using SingleColumnValueFilter
Kumar, Suresh 2012-10-16, 04:24
I tried that, and it didn't work. I thought the GREATER and LESS operators would not work with the string comparators.
I would like to use startRow() and stopRow() on the scan, but these operate on plain strings, not on regular expressions as I need.
Suresh
-
RE: Retrieving rows with specific values using SingleColumnValueFilter
Ramkrishna.S.Vasudevan 2012-10-16, 04:48
Hi Suresh
> I would like to use startRow() and stopRow() on a scan, but these operations
To set the start and stop rows you need to know the row key.
> between column values "DEBUG:xxxxx" and "yyyyy". How can I force scan to return all these rows?
Have you called setFilterIfMissing(true) on the filters? By default it is false, so rows that lack the column are still returned.
Regards,
Ram
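To make the setFilterIfMissing point concrete, here is a minimal sketch in plain Java. It is not HBase code: the class MissingColumnModel and the method keepRow are hypothetical names, and a row is modelled as a simple map from column name to value. It only illustrates the rule that a row without the tested column passes the filter unless the flag is set to true.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical stand-in for a single-column value filter; NOT the HBase API.
final class MissingColumnModel {

    // Returns true when the row should be kept.
    // If the column is absent, the row is kept unless filterIfMissing is true.
    static boolean keepRow(Map<String, String> row, String column,
                           Predicate<String> condition, boolean filterIfMissing) {
        String value = row.get(column);
        if (value == null) {
            return !filterIfMissing;
        }
        return condition.test(value);
    }

    public static void main(String[] args) {
        Map<String, String> withColumn = new HashMap<>();
        withColumn.put("pcol", "DEBUG:xxxxx1");

        Map<String, String> withoutColumn = new HashMap<>();
        withoutColumn.put("other", "unrelated");

        Predicate<String> isDebug = v -> v.startsWith("DEBUG:");

        System.out.println(keepRow(withColumn, "pcol", isDebug, false));    // true
        System.out.println(keepRow(withoutColumn, "pcol", isDebug, false)); // true
        System.out.println(keepRow(withoutColumn, "pcol", isDebug, true));  // false
    }
}
```

With the flag left at its default, the second row slips through even though it has no "pcol" value at all, which can make scan results look wrong when not every row carries the column.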
-
Re: Retrieving rows with specific values using SingleColumnValueFilter
Alok Kumar 2012-10-16, 19:08
Hi,
You have set MUST_PASS_ALL:
> FilterList list = new FilterList(FilterList.Operator.MUST_PASS_ALL);
which means both filters (filter1 && filter2) must be true for the same row. That never happens here, because no single value matches both conditions, and that is why the scan returns nothing when you use the two filters together.
As Norbert suggested, you should try CompareOp.GREATER_OR_EQUAL and CompareOp.LESS_OR_EQUAL, respectively. A plain byte[] value is compared lexicographically; the regex and substring comparators only support EQUAL and NOT_EQUAL.
If I have understood correctly, this should work for you:
---------------------------------------------------------------------------------
FilterList filterList = new FilterList(Operator.MUST_PASS_ALL);

// Lower bound: column value >= "DEBUG:xxxxx"
Filter filter1 = new SingleColumnValueFilter(Bytes.toBytes("mylogs"),
        Bytes.toBytes("pcol"), CompareOp.GREATER_OR_EQUAL,
        Bytes.toBytes("DEBUG:xxxxx"));
filterList.addFilter(filter1);

// Upper bound: column value <= "DEBUG:yyyyy"
Filter filter2 = new SingleColumnValueFilter(Bytes.toBytes("mylogs"),
        Bytes.toBytes("pcol"), CompareOp.LESS_OR_EQUAL,
        Bytes.toBytes("DEBUG:yyyyy"));
filterList.addFilter(filter2);

Scan scan = new Scan();
// scan.setStartRow(startRow); // optional
// scan.setStopRow(stopRow);   // optional
scan.setFilter(filterList);

HTablePool pool = new HTablePool();
HTableInterface table = pool.getTable("tableName");
ResultScanner scanner = null;
try {
    scanner = table.getScanner(scan);
    for (Result result : scanner) {
        // you get your result.
    }
} catch (IOException e) {
    e.printStackTrace();
} finally {
    if (scanner != null) {
        scanner.close();
    }
}
----------------------------------------------------------------------------------
Regards,
Alok
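
The difference between the two filter lists discussed in this thread can be sketched in plain Java, without HBase. This is a simplified model under stated assumptions: ValueRangeModel and its methods are hypothetical names, the regex filter is modelled with Matcher.find(), the substring filter with a case-insensitive contains(), and the byte[] comparison as an unsigned byte-by-byte comparison. The real comparators may differ in detail.

```java
import java.nio.charset.StandardCharsets;
import java.util.regex.Pattern;

// Hypothetical model of the two filter combinations; NOT the HBase API.
final class ValueRangeModel {

    // Lexicographic comparison of two byte arrays, treating bytes as unsigned.
    static int compareBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = a[i] & 0xff;
            int y = b[i] & 0xff;
            if (x != y) {
                return x - y;
            }
        }
        return a.length - b.length; // a shorter prefix sorts first
    }

    // Models GREATER_OR_EQUAL lower AND LESS_OR_EQUAL upper (MUST_PASS_ALL).
    static boolean inRange(String value, String lower, String upper) {
        byte[] v = value.getBytes(StandardCharsets.UTF_8);
        return compareBytes(v, lower.getBytes(StandardCharsets.UTF_8)) >= 0
                && compareBytes(v, upper.getBytes(StandardCharsets.UTF_8)) <= 0;
    }

    // Models the original pair: regex EQUAL AND substring EQUAL (MUST_PASS_ALL).
    static boolean matchesBothPatterns(String value) {
        boolean regexHit = Pattern.compile("DEBUG:xxxxx.").matcher(value).find();
        boolean substringHit = value.toLowerCase().contains("yyyyy");
        return regexHit && substringHit;
    }

    public static void main(String[] args) {
        String[] values = {
            "DEBUG:aaa", "DEBUG:xxxxx1", "DEBUG:xyz", "DEBUG:yyyyy",
            "DEBUG:yyyyyz", "INFO:abc"
        };
        for (String v : values) {
            System.out.println(v
                    + " range=" + inRange(v, "DEBUG:xxxxx", "DEBUG:yyyyy")
                    + " both=" + matchesBothPatterns(v));
        }
    }
}
```

In this model none of the sample values satisfies both EQUAL patterns at once, while the range check keeps "DEBUG:xxxxx1", "DEBUG:xyz" and "DEBUG:yyyyy". Note that "DEBUG:yyyyyz" falls outside the range because it sorts after the upper bound, so the upper bound may need to be chosen a little beyond the last value you want.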