-
Is it possible to indicate the column scan order when scanning table?
yonghu 2013-02-07, 17:23
Dear all,
I wonder whether it is possible to specify the column scan order when scanning a table. For example, suppose I have two column families, cf1 and cf2, and I create a Scan object. Is the scanning order of scan.addFamily(cf1) followed by scan.addFamily(cf2) the same as that of scan.addFamily(cf2) followed by scan.addFamily(cf1)? If the order is the same, is it possible to specify the scanning order of the table?
regards!
Yong
-
Re: Is it possible to indicate the column scan order when scanning table?
Ted Yu 2013-02-07, 17:29
Can you give us the use case where the scanning order is significant?
Thanks
-
Re: Is it possible to indicate the column scan order when scanning table?
yonghu 2013-02-07, 20:01
For example, a table can contain data with a TTL and static data with no TTL. I want to scan the columns that have TTL restrictions first and the static columns afterwards. My goal is to reduce the amount of data missed because of TTL expiration during the scan.
regards!
Yong
-
Re: Is it possible to indicate the column scan order when scanning table?
Sergey Shelukhin 2013-02-07, 21:07
CFs are scanned in parallel in HBase, and each row is built in turn; scanning an entire CF and then building rows by scanning a different entire CF wouldn't scale very well. Do you filter data on the TTL column family?
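The row-by-row assembly described above can be sketched with a small merge over per-family cursors. This is a standard-library model only, not HBase code: the class ParallelFamilyScanSketch, its Cell and FamilyCursor types, and the sample data are all hypothetical, and the assumption is that each family's cells are already sorted by row key.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical model of a multi-family scan; none of these types are HBase classes.
class ParallelFamilyScanSketch {

    // One stored cell: row key, column family, value.
    static final class Cell {
        final String row;
        final String family;
        final String value;

        Cell(String row, String family, String value) {
            this.row = row;
            this.family = family;
            this.value = value;
        }

        @Override
        public String toString() {
            return row + "/" + family + "=" + value;
        }
    }

    // Cursor over one family's cells, assumed to be sorted by row key.
    static final class FamilyCursor {
        private final List<Cell> cells;
        private int pos = 0;

        FamilyCursor(List<Cell> cells) { this.cells = cells; }
        Cell peek() { return cells.get(pos); }
        Cell next() { return cells.get(pos++); }
        boolean hasNext() { return pos < cells.size(); }
    }

    // Merges all family cursors so that cells come out row by row,
    // and by family name within a row.
    static List<String> scan(List<List<Cell>> families) {
        Comparator<FamilyCursor> order = (a, b) -> {
            int byRow = a.peek().row.compareTo(b.peek().row);
            return byRow != 0 ? byRow : a.peek().family.compareTo(b.peek().family);
        };
        PriorityQueue<FamilyCursor> heap = new PriorityQueue<>(order);
        for (List<Cell> family : families) {
            if (!family.isEmpty()) heap.add(new FamilyCursor(family));
        }
        List<String> out = new ArrayList<>();
        while (!heap.isEmpty()) {
            FamilyCursor cursor = heap.poll();      // cursor whose next cell sorts first
            out.add(cursor.next().toString());
            if (cursor.hasNext()) heap.add(cursor); // re-queue under its new head cell
        }
        return out;
    }

    // Sample data, made up for illustration.
    static List<Cell> cf1() {
        return Arrays.asList(new Cell("r1", "cf1", "a"), new Cell("r2", "cf1", "b"));
    }

    static List<Cell> cf2() {
        return Arrays.asList(new Cell("r1", "cf2", "x"), new Cell("r3", "cf2", "y"));
    }

    public static void main(String[] args) {
        // The families are registered in opposite orders; the merged output is the same.
        System.out.println(scan(Arrays.asList(cf1(), cf2())));
        System.out.println(scan(Arrays.asList(cf2(), cf1())));
    }
}
```

In this model the order in which the families are handed to scan() has no effect on the result, because the output order is decided by the row key and family name of each cell. To the extent the model is a fair picture of a real scan, that suggests why the order of the addFamily calls would not let one family be read entirely before another.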
-
Re: Is it possible to indicate the column scan order when scanning table?
Ted Yu 2013-02-07, 21:11
Yonghu: You may want to take a look at HBASE-5416 (Improve performance of scans with some kind of filters). It will be in the upcoming 0.94.5 release.
You can designate an essential column family. Based on the result from this column family, the other column families can then be scanned.
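The idea of an essential column family can be sketched as follows. This is a standard-library model under stated assumptions, not the HBASE-5416 implementation: the class EssentialFamilySketch, its fields, and the sample data are hypothetical. In HBase itself the feature is, as far as I know, switched on per scan with Scan.setLoadColumnFamiliesOnDemand(true) together with a filter that reports which family it needs; please check the Javadoc of your release before relying on that.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Predicate;

// Hypothetical model of on-demand family loading; this is not HBase code.
class EssentialFamilySketch {
    private final TreeMap<String, String> essential; // row key -> value in the filtered family
    private final TreeMap<String, String> other;     // row key -> value in the remaining family
    private int reads = 0;                           // lookups made in the remaining family

    EssentialFamilySketch(TreeMap<String, String> essential, TreeMap<String, String> other) {
        this.essential = essential;
        this.other = other;
    }

    // Evaluates the filter on the essential family first and touches the
    // other family only for rows that pass.
    Map<String, String> scan(Predicate<String> filter) {
        Map<String, String> rows = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : essential.entrySet()) {
            if (!filter.test(e.getValue())) {
                continue; // rejected on the essential family alone
            }
            reads++;
            rows.put(e.getKey(), e.getValue() + "|" + other.get(e.getKey()));
        }
        return rows;
    }

    int otherFamilyReads() {
        return reads;
    }

    // Sample data, made up for illustration.
    static EssentialFamilySketch sample() {
        TreeMap<String, String> essential = new TreeMap<>();
        essential.put("r1", "keep");
        essential.put("r2", "drop");
        essential.put("r3", "keep");
        TreeMap<String, String> other = new TreeMap<>();
        other.put("r1", "A");
        other.put("r2", "B");
        other.put("r3", "C");
        return new EssentialFamilySketch(essential, other);
    }

    public static void main(String[] args) {
        EssentialFamilySketch table = sample();
        System.out.println(table.scan(v -> v.equals("keep")));
        System.out.println("reads from other family: " + table.otherFamilyReads());
    }
}
```

In the sample, three rows are examined in the essential family but only the two that pass the filter cause a read of the other family. The saving comes from skipping rows, so it depends on how selective the filter is.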
Cheers
-
Re: Is it possible to indicate the column scan order when scanning table?
yonghu 2013-02-08, 05:36
Thanks for your response. I will take a look.
yong