Home | About | Sematext search-lucene.com search-hadoop.com
 Search Hadoop and all its subprojects:

Switch to Plain View
HBase, mail # dev - Re: [ANN]: HBaseWD: Distribute Sequential Writes in HBase


+
Alex Baranau 2011-05-11, 20:01
+
Alex Baranau 2011-05-11, 20:41
+
Weishung Chung 2011-05-13, 07:45
+
Alex Baranau 2011-05-13, 20:12
+
Weishung Chung 2011-05-14, 15:17
Copy link to this message
-
Re: [ANN]: HBaseWD: Distribute Sequential Writes in HBase
Weishung Chung 2011-05-17, 22:19
I have another question. For overwriting, do I need to delete the existing
one before re-writing it?

On Sat, May 14, 2011 at 10:17 AM, Weishung Chung <[EMAIL PROTECTED]> wrote:

> Yes, it's simple yet useful. I am integrating it. Thanks alot :)
>
>
> On Fri, May 13, 2011 at 3:12 PM, Alex Baranau <[EMAIL PROTECTED]>wrote:
>
>> Thanks for the interest!
>>
>> We are using it in production. It is simple and hence quite stable. Though
>> some minor pieces are missing (like
>> https://github.com/sematext/HBaseWD/issues/1) this doesn't affect
>> stability
>> and/or major functionality.
>>
>> Alex Baranau
>> ----
>> Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - Hadoop -
>> HBase
>>
>> On Fri, May 13, 2011 at 10:45 AM, Weishung Chung <[EMAIL PROTECTED]>
>> wrote:
>>
>> > What's the status on this package? Is it mature enough?
>> >  I am using it in my project, tried out the write method yesterday and
>> > going
>> > to incorporate into read method tomorrow.
>> >
>> > On Wed, May 11, 2011 at 3:41 PM, Alex Baranau <[EMAIL PROTECTED]
>> > >wrote:
>> >
>> > > > The start/end rows may be written twice.
>> > >
>> > > Yeah, I know. I meant that size of startRow+stopRow data is "bearable"
>> in
>> > > attribute value no matter how long are they (keys), since we already
>> OK
>> > > with
>> > > transferring them initially (i.e. we should be OK with transferring 2x
>> > > times
>> > > more).
>> > >
>> > > So, what about the suggestion of sourceScan attribute value I
>> mentioned?
>> > If
>> > > you can tell why it isn't sufficient in your case, I'd have more info
>> to
>> > > think about better suggestion ;)
>> > >
>> > > > It is Okay to keep all versions of your patch in the JIRA.
>> > > > Maybe the second should be named HBASE-3811-v2.patch<
>> > >
>> >
>> https://issues.apache.org/jira/secure/attachment/12478694/HBASE-3811.patch
>> > > >?
>> > >
>> > > np. Can do that. Just thought that they (patches) can be sorted by
>> date
>> > to
>> > > find out the final one (aka "convention over naming-rules").
>> > >
>> > > Alex.
>> > >
>> > > On Wed, May 11, 2011 at 11:13 PM, Ted Yu <[EMAIL PROTECTED]> wrote:
>> > >
>> > > > >> Though it might be ok, since we anyways "transfer" start/stop
>> rows
>> > > with
>> > > > Scan object.
>> > > > In write() method, we now have:
>> > > >     Bytes.writeByteArray(out, this.startRow);
>> > > >     Bytes.writeByteArray(out, this.stopRow);
>> > > > ...
>> > > >       for (Map.Entry<String, byte[]> attr :
>> this.attributes.entrySet())
>> > {
>> > > >         WritableUtils.writeString(out, attr.getKey());
>> > > >         Bytes.writeByteArray(out, attr.getValue());
>> > > >       }
>> > > > The start/end rows may be written twice.
>> > > >
>> > > > Of course, you have full control over how to generate the unique ID
>> for
>> > > > "sourceScan" attribute.
>> > > >
>> > > > It is Okay to keep all versions of your patch in the JIRA. Maybe the
>> > > second
>> > > > should be named HBASE-3811-v2.patch<
>> > >
>> >
>> https://issues.apache.org/jira/secure/attachment/12478694/HBASE-3811.patch
>> > > >?
>> > > >
>> > > > Thanks
>> > > >
>> > > >
>> > > > On Wed, May 11, 2011 at 1:01 PM, Alex Baranau <
>> > [EMAIL PROTECTED]
>> > > >wrote:
>> > > >
>> > > >> > Can you remove the first version ?
>> > > >> Isn't it ok to keep it in JIRA issue?
>> > > >>
>> > > >>
>> > > >> > In HBaseWD, can you use reflection to detect whether Scan
>> supports
>> > > >> setAttribute() ?
>> > > >> > If it does, can you encode start row and end row as "sourceScan"
>> > > >> attribute ?
>> > > >>
>> > > >> Yeah, smth like this is going to be implemented. Though I'd still
>> want
>> > > to
>> > > >> hear from the devs the story about Scan version.
>> > > >>
>> > > >>
>> > > >> > One consideration is that start row or end row may be quite long.
>> > > >>
>> > > >> Yeah, that is was my though too at first. Though it might be ok,
>> since
>> > > we
>> > > >> anyways "transfer" start/stop rows with Scan object.
+
Alex Baranau 2011-05-18, 15:03
+
Weishung Chung 2011-05-18, 21:15
+
Ted Yu 2011-05-18, 23:18
+
Weishung Chung 2011-05-19, 03:50
+
Alex Baranau 2011-05-19, 13:14
+
Weishung Chung 2011-05-19, 13:45