ways to make orders when it puts
JUN YOUNG KIM 2012-10-04, 08:54
hi, hbase users.
I am wondering how we can control ordering when multiple threads put concurrently. I mean that the threads are working like this:
thread1 puts A1 (rowkey)
thread2 puts A2
thread3 puts A3
Because the actual timing is unpredictable, thread1 may put earlier than thread2, and thread3 may put earlier than thread1.
Yes, I know that HBase will store them in key order, like A1 -> A2 -> A3.
But how could I store my data in write-time order, like A3 -> A1 -> A2?
Even if I put a timestamp value in front of A#, the situation I described could still happen.
Any ideas? (You can change the row key structure, as long as it satisfies the conditions I want to achieve.)
Thanks for your help.
Re: ways to make orders when it puts
Michael Segel 2012-10-04, 15:19
Silly question. Why do you care how your data is being stored?
Does it matter if the data is stored in rows where A1, A2, A3 is the order of the keys, or if it's A3, A1, A2?
If you say that you want to store the rows in order based on entry time, you're also going to have to deal with the nasty little problem of hot spotting, along with your regions being only half full after a split.
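The hot-spotting concern can be illustrated with a small simulation. This is only a sketch under stated assumptions: the split points, the fixed-width key format, and the helper names `region_for` and `write_distribution` are hypothetical and do not come from HBase itself. It models a table whose regions are defined by sorted split keys and shows that keys built from an ever-increasing timestamp all land in the last region.

```python
from bisect import bisect_right
from collections import Counter

# Hypothetical split points: region i holds keys in [SPLITS[i-1], SPLITS[i]).
SPLITS = ["0000001000", "0000002000", "0000003000"]

def region_for(row_key: str) -> int:
    """Index of the (simulated) region whose key range contains row_key."""
    return bisect_right(SPLITS, row_key)

def write_distribution(keys):
    """Count how many writes each simulated region receives."""
    return Counter(region_for(k) for k in keys)

# New writes carry ever-increasing timestamps, so every new key sorts
# after all existing keys and falls into the last region.
new_keys = [f"{ts:010d}" for ts in range(3000, 3100)]
hits = write_distribution(new_keys)
print(dict(hits))
```

In this toy model every one of the 100 writes goes to region 3 while regions 0 to 2 receive none, which is the kind of imbalance described above.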
Re: ways to make orders when it puts
Dan Han 2012-10-04, 15:54
I am not sure I understood your question, but if the data is not stored the way you expect, I guess the problem might be the row key. As we all know, row keys are sorted in lexicographic order in HBase. For example, "10" sorts before "9". So if your row keys include the numbers 1 ... 10, it is necessary to pad the single-digit numbers with a leading "0".
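A minimal sketch of the zero-padding idea, using plain Python strings to stand in for row keys. The helper `pad_key` and the width of 2 are assumptions for illustration; for ASCII digits, Python's string ordering matches the byte-wise ordering described here.

```python
def pad_key(n: int, width: int = 2) -> str:
    """Left-pad n with zeros so that string order matches numeric order."""
    return str(n).zfill(width)

numbers = list(range(1, 11))

# Without padding, "10" sorts right after "1" and before "2" ... "9".
unpadded_sorted = sorted(str(n) for n in numbers)

# With fixed-width padding, lexicographic order equals numeric order.
padded_sorted = sorted(pad_key(n) for n in numbers)

print(unpadded_sorted)
print(padded_sorted)
```

The width has to be chosen up front to cover the largest expected value, since changing it later would alter the sort position of existing keys.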
Best Wishes,
Dan Han
Re: ways to make orders when it puts
Michael Segel 2012-10-04, 16:06
I took it that the OP wants to store the rows A1->A3 in the order in which they came in. So it could be A3, A1, A2, as an example. To do this, you end up prefixing the rowkey with a timestamp or something.
This is not a good idea, and I was curious as to why the order of entry was important to the OP.
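For readers unfamiliar with the approach being warned against, here is a rough sketch of what a timestamp-prefixed row key might look like. The helper `time_prefixed_key`, the 13-digit millisecond format, the `_` separator, and the timestamps themselves are all made up for illustration; the thread does not specify any of them.

```python
def time_prefixed_key(ts_millis: int, key: str) -> str:
    """Build a row key whose sort order follows the timestamp first."""
    # Fixed width keeps lexicographic order equal to numeric order.
    return f"{ts_millis:013d}_{key}"

# Arrival order from the example in the thread: A3, then A1, then A2.
# The timestamps are invented values.
arrivals = [
    (1349340840001, "A3"),
    (1349340840002, "A1"),
    (1349340840003, "A2"),
]

stored = sorted(time_prefixed_key(ts, k) for ts, k in arrivals)
order = [row.split("_", 1)[1] for row in stored]
print(order)
```

The rows now sort in arrival order, but every new key is larger than all previous ones, which is what leads to the hot spotting mentioned earlier in the thread.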
Re: ways to make orders when it puts
Henry JunYoung KIM 2012-10-05, 01:58
Yes, we need this for our data indexer.
I mean that HBase needs to store a list of data ordered by entry time, and the indexer then fetches the new data by scanning from a start key with a limit count.
To make this easier to understand, suppose I used a timestamp as the row key:
ts  data
1   D1
2   D2
3   D3
4   D4
Suppose the indexer has finished its job up to ts 3, but at that point one of the threads stores its data under row key ts 2.
In this situation the indexer will miss ts 2, even though it should be indexed.
That is why I asked how to store data in entry-time order.
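The race described above can be reproduced with a toy model. This is a sketch only: a Python dict stands in for the table, integer keys stand in for timestamp row keys, and `scan` is a hypothetical helper that mimics a scan from a start key with a limit. It is not the HBase client API.

```python
def scan(table, start_key, limit):
    """Return up to `limit` (key, value) pairs with key >= start_key, in key order."""
    rows = sorted((k, v) for k, v in table.items() if k >= start_key)
    return rows[:limit]

# ts 2 has been assigned to a writer thread, but its put has not landed yet.
table = {1: "D1", 3: "D3"}
indexed = []
next_start = 0

# First indexer pass: reads ts 1 and ts 3, then moves its start key past ts 3.
batch = scan(table, next_start, limit=10)
indexed += [v for _, v in batch]
next_start = batch[-1][0] + 1

# The slow writer finally stores ts 2; another writer stores ts 4.
table[2] = "D2"
table[4] = "D4"

# Second indexer pass starts at ts 4, so ts 2 is never seen.
batch = scan(table, next_start, limit=10)
indexed += [v for _, v in batch]
print(indexed)
```

In this model the indexer ends up with D1, D3, and D4, and D2 is silently skipped because it was written behind the indexer's start key.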
Thanks for your help.
Re: ways to make orders when it puts
Michael Segel 2012-10-05, 10:58
You need to be a bit more specific. Your design doesn't make any sense and you're now starting a separate thread on this topic...