HBase, mail # user - ways to make orders when it puts


Re: ways to make orders when it puts
Henry JunYoung KIM 2012-10-05, 01:58
Yes, our data indexer needs this.

I mean that HBase needs to store a list of data ordered by entry time, so that the indexer can then look for new data using a start key and a limit count.

To illustrate:

Suppose I use a timestamp as the row key:

ts    data
1       D1
2       D2
3       D3
4       D4

The indexer has finished its work up to ts 3.
But at this point, one of the threads stores its data under row key ts 2.

In this situation the indexer misses ts 2, even though it should be indexed.

That is why I asked how to store data in order of entry time.
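
To make the problem concrete, here is a minimal sketch in plain Python. It does not use the HBase client at all: `SortedTable` is a hypothetical in-memory stand-in for a table whose rows are kept sorted by row key, and integer keys are used only for readability (real row keys are byte arrays). It just replays the sequence described above.

```python
import bisect


class SortedTable:
    """Toy stand-in for a table whose rows are kept sorted by row key."""

    def __init__(self):
        self._keys = []   # row keys, always kept in sorted order
        self._rows = {}   # row key -> data

    def put(self, key, value):
        if key not in self._rows:
            bisect.insort(self._keys, key)
        self._rows[key] = value

    def scan(self, start_key, limit):
        """Return up to `limit` (key, value) pairs with key >= start_key."""
        i = bisect.bisect_left(self._keys, start_key)
        return [(k, self._rows[k]) for k in self._keys[i:i + limit]]


table = SortedTable()
indexed = []

# The writers for ts 1, 3 and 4 finish first; the writer for ts 2 is delayed.
for ts, data in [(1, "D1"), (3, "D3"), (4, "D4")]:
    table.put(ts, data)

# The indexer scans from start key 1 with limit 2 and sees ts 1 and ts 3.
batch = table.scan(start_key=1, limit=2)
indexed.extend(batch)
next_start = batch[-1][0] + 1   # resume just after the last key seen

# Only now does the delayed writer store its row under ts 2.
table.put(2, "D2")

# The next scan starts after ts 3, so ts 2 is never returned.
batch = table.scan(start_key=next_start, limit=2)
indexed.extend(batch)

indexed_keys = [k for k, _ in indexed]
missed = [k for k, _ in table.scan(0, 100) if k not in indexed_keys]
print("indexed:", indexed_keys)
print("missed:", missed)
```

The row for ts 2 is in the table at the end, but the indexer's start key has already moved past it.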

Thanks for your help.
On Oct 5, 2012, at 1:06 AM, Michael Segel <[EMAIL PROTECTED]> wrote:

> I took it that the OP wants to store the rows A1->A3 in the order in which they came in. So it could be A3, A1, A2, as an example.
> So to do this you end up prefixing the rowkey with a timestamp or something.
>
> This is not a good idea, and I was curious as to why the order of entry was important to the OP.
>
>
> On Oct 4, 2012, at 10:54 AM, Dan Han <[EMAIL PROTECTED]> wrote:
>
>> I am not sure I understood your question, but if the data is not stored the way
>> you expect,
>> I guess it might be a problem with the row key.
>> As we all know, row keys are sorted in lexicographic order in HBase.
>> For example, 10 comes before 9. So if your row keys include 1 ... 10,
>> it is necessary to pad the single-digit keys with a leading "0".
>>
>> Best Wishes
>> Dan Han
>>
>>
>>
>> On Thu, Oct 4, 2012 at 9:19 AM, Michael Segel <[EMAIL PROTECTED]> wrote:
>>
>>> Silly question. Why do you care how your data is being stored?
>>>
>>> Does it matter if the data is stored in rows where A1, A2, A3 is the order
>>> of the keys, or
>>> if it's A3, A1, A2?
>>>
>>> If you say that you want to store the rows in order based on entry time,
>>> you're going to also have to deal with a nasty little problem of hot
>>> spotting along with your regions being only half full post split.
>>>
>>>
>>> On Oct 4, 2012, at 3:54 AM, JUN YOUNG KIM <[EMAIL PROTECTED]> wrote:
>>>
>>>> Hi, HBase users.
>>>>
>>>> I am wondering how we can control the order of rows when we put from multiple threads.
>>>> What I mean is this:
>>>>
>>>> The threads work like this:
>>>>
>>>> thread1 puts A1 (rowkey)
>>>> thread2 puts A2
>>>> thread3 puts A3
>>>>
>>>> But because of unpredictable timing,
>>>> thread1 puts earlier than thread2, and
>>>> thread3 puts earlier than thread1.
>>>>
>>>> Yes, I know that HBase will store them in key order, like A1 -> A2 -> A3.
>>>>
>>>> But how could I store my data in write-time order, like A3 -> A1 -> A2?
>>>>
>>>> Even if I put a timestamp value in front of A#, the situation I described
>>> could still happen.
>>>>
>>>> Any ideas?
>>>> (You can change the row key structure, as long as it satisfies the conditions I want
>>> to achieve.)
>>>>
>>>> Thanks for your help.
>>>>
>>>>
>>>
>>>
>
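
On the lexicographic-ordering point raised earlier in the thread (10 sorts before 9 unless the keys are padded), the following is a small sketch in plain Python. It only compares byte strings the way a byte-wise sorted store would. The helper `padded_key` and its `width` parameter are hypothetical names chosen for this illustration, not part of any HBase API.

```python
def padded_key(n, width=2):
    """Encode a non-negative integer as a fixed-width, zero-padded ASCII key."""
    return str(n).zfill(width).encode("ascii")


# Unpadded keys: byte-wise comparison puts b"10" right after b"1".
raw = sorted(str(n).encode("ascii") for n in range(1, 11))

# Zero-padded keys: byte-wise order now matches numeric order.
padded = sorted(padded_key(n) for n in range(1, 11))

print(raw)
print(padded)
```

The width has to be chosen large enough for the biggest key that will ever be written, because keys of different lengths would again sort out of numeric order.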