-
RowKey design with hashing
Nurettin Şimşek 2013-02-13, 08:35
Hi All,
In our project, mail addresses are the row key. Which row key design should we choose?
1) com.yahoo@xxxx (Reversed)
2) [EMAIL PROTECTED]
3) md5 hash([EMAIL PROTECTED])
4) Any other solution.
Many thanks.
--
M. Nurettin ŞİMŞEK
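For reference, the three candidate keys from the question could be built roughly as follows. This is only a sketch: the function names are hypothetical, the sample address is made up, and it assumes the address contains a single "@" separating the local part from the domain.

```python
import hashlib


def reversed_domain_key(email: str) -> bytes:
    """Option 1: reverse the domain labels and put them first, e.g. com.yahoo@xxxx."""
    local, _, domain = email.rpartition("@")
    reversed_domain = ".".join(reversed(domain.split(".")))
    return f"{reversed_domain}@{local}".encode("utf-8")


def plain_key(email: str) -> bytes:
    """Option 2: use the address as-is."""
    return email.encode("utf-8")


def md5_key(email: str) -> bytes:
    """Option 3: fixed-width 16-byte MD5 digest of the address."""
    return hashlib.md5(email.encode("utf-8")).digest()


sample = "xxxx@yahoo.com"  # made-up address for illustration
print(reversed_domain_key(sample))
print(plain_key(sample))
print(md5_key(sample).hex())
```

Note that the hashed key always has the same length, whereas the other two vary with the length of the address.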
-
Re: RowKey design with hashing
Alexander Ignatov 2013-02-13, 08:40
If all mail addresses share a single domain such as 'yahoo.com', you can probably use 'xxxx' alone as the row key, without appending '@yahoo.com'.
--
Regards,
Alexander Ignatov
-
Re: RowKey design with hashing
Amit Sela 2013-02-13, 09:01
If your domains are well distributed, use the reversed-domain key; it will let you scan by domain faster.
-
Re: RowKey design with hashing
Nurettin Şimşek 2013-02-13, 09:42
I want to look up email addresses by equality. There are many, many domains, not only yahoo.
What are the disadvantages of using hashing?
-
Re: RowKey design with hashing
Jean-Marc Spaggiari 2013-02-13, 12:06
I don't see any issue with #2, and it might be the simplest one. But it all depends on your read pattern.
If you need to scan by domain, 1 is better.
If you need to list the emails without knowing them, 2 might be better.
If you only access a row given a specific address, 3 can be good.
So I would say it all depends on what you want to do with it...
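To make the read-pattern point concrete, here is a small simulation. It is a sketch only: a sorted Python list stands in for a table ordered by row key, a dict stands in for point gets, and the addresses and helper names are made up. It does not use the HBase client API.

```python
import bisect
import hashlib

# Made-up addresses for illustration.
emails = ["alice@yahoo.com", "bob@gmail.com", "carol@yahoo.com", "dave@example.org"]


def reversed_domain_key(email: str) -> str:
    """Build a key such as com.yahoo@alice."""
    local, _, domain = email.rpartition("@")
    return ".".join(reversed(domain.split("."))) + "@" + local


def prefix_scan(sorted_keys, prefix):
    """Seek to the first key >= prefix, then read until the prefix no longer matches."""
    start = bisect.bisect_left(sorted_keys, prefix)
    rows = []
    for key in sorted_keys[start:]:
        if not key.startswith(prefix):
            break
        rows.append(key)
    return rows


# Design 1: rows of one domain are adjacent, so a domain lookup is a short range scan.
reversed_keys = sorted(reversed_domain_key(e) for e in emails)
yahoo_rows = prefix_scan(reversed_keys, "com.yahoo@")

# Design 2: the domain is a suffix, so a domain lookup has to inspect every key.
plain_keys = sorted(emails)
yahoo_plain = [k for k in plain_keys if k.endswith("@yahoo.com")]

# Design 3: hashing loses the ordering, but an exact-match lookup still works.
hashed_table = {hashlib.md5(e.encode("utf-8")).hexdigest(): e for e in emails}


def point_get(email: str):
    """Exact-match lookup by the hash of the full address."""
    return hashed_table.get(hashlib.md5(email.encode("utf-8")).hexdigest())


print(yahoo_rows)
print(yahoo_plain)
print(point_get("bob@gmail.com"))
```

With the hashed design there is no meaningful prefix to scan on, which is the main thing given up in exchange for evenly spread keys.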
-
Re: RowKey design with hashing
Nurettin Şimşek 2013-02-13, 20:03
Thanks Jean,
3 can be good for us.
-
Re: RowKey design with hashing
lars hofhansl 2013-02-14, 00:50
Depends on your search pattern.
If you never care about scan ordering, i.e. you only do point gets to see whether you've already seen an email address, go with the hash option.
I'd prefer #1 over #2, because it would let you use efficient key prefix block encoding (FAST_DIFF).
-- Lars
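A rough way to see why the reversed-domain key suits prefix encoding is to measure how many leading bytes adjacent keys share once sorted. The snippet below is only an illustration of that idea, not the FAST_DIFF algorithm itself; the addresses and function names are made up.

```python
import hashlib


def common_prefix_len(a: bytes, b: bytes) -> int:
    """Number of leading bytes that a and b have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def avg_shared_prefix(keys) -> float:
    """Average shared prefix length between neighbouring keys in sorted order."""
    ordered = sorted(keys)
    if len(ordered) < 2:
        return 0.0
    total = sum(common_prefix_len(a, b) for a, b in zip(ordered, ordered[1:]))
    return total / (len(ordered) - 1)


def reversed_domain_key(email: str) -> bytes:
    """Build a key such as com.yahoo@alice."""
    local, _, domain = email.rpartition("@")
    return (".".join(reversed(domain.split("."))) + "@" + local).encode("utf-8")


# Made-up addresses for illustration.
emails = ["alice@yahoo.com", "carol@yahoo.com", "bob@gmail.com", "dave@gmail.com"]

reversed_avg = avg_shared_prefix(reversed_domain_key(e) for e in emails)
plain_avg = avg_shared_prefix(e.encode("utf-8") for e in emails)
hashed_avg = avg_shared_prefix(hashlib.md5(e.encode("utf-8")).digest() for e in emails)

print(reversed_avg, plain_avg, hashed_avg)
```

In this toy sample the reversed keys share their domain prefix with their neighbours, while the plain keys share nothing because the local part comes first. Hashed keys behave like random bytes, so neighbouring keys rarely share more than a byte or so.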
-
Re: RowKey design with hashing
Jean-Marc Spaggiari 2013-02-14, 02:09
Hi Lars,
Can you please tell me more about key prefix block encoding, or refer me to a blog/doc? How it works, what it is, etc.
Thanks,
JM
-
Re: RowKey design with hashing
Ted Yu 2013-02-14, 03:18
Jean-Marc:
You can find almost all the details you need in this JIRA:
HBASE-4218 Data Block Encoding of KeyValues (aka delta encoding / prefix compression)
Cheers
-
Re: RowKey design with hashing
Mehmet Simsek 2013-02-14, 03:41
Thanks Lars
M.Nurettin Şimşek
-
Re: RowKey design with hashing
Ted Yu 2013-02-14, 03:58
My name is Ted, not Lars :-)
-
Re: RowKey design with hashing
Jean-Marc Spaggiari 2013-02-25, 02:25
Hi Ted,
Thanks for pointing me to HBASE-4218. I will take a look at it.
JM