HBase >> mail # dev >> Re: [ANN]: HBaseWD: Distribute Sequential Writes in HBase


Earlier messages in this thread (collapsed):
- Alex Baranau 2011-05-11, 20:01
- Alex Baranau 2011-05-11, 20:41
- Weishung Chung 2011-05-13, 07:45
- Alex Baranau 2011-05-13, 20:12
- Weishung Chung 2011-05-14, 15:17
Re: [ANN]: HBaseWD: Distribute Sequential Writes in HBase
I have another question. For overwriting, do I need to delete the existing
one before re-writing it?
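[Archive note: in HBase, a Put to the same row key simply writes a newer cell version, so no explicit Delete is needed before re-writing. And since HBaseWD derives the salt prefix deterministically from the original key, a re-write lands on the same distributed key. A minimal standalone sketch of that idea — class and method names here are illustrative, not the real HBaseWD API:]

```java
import java.util.Arrays;

// Standalone sketch of HBaseWD-style one-byte-prefix salting (illustrative
// names; the real library exposes getDistributedKey() on its distributor
// classes). The prefix is a deterministic function of the original key, so
// re-writing the same logical row always produces the same salted key --
// a plain Put overwrite works, no prior Delete needed.
public class SaltDemo {
    static final int BUCKETS = 8;

    static byte[] getDistributedKey(byte[] originalKey) {
        // Deterministic bucket choice; mask keeps the hash non-negative.
        byte prefix = (byte) ((Arrays.hashCode(originalKey) & 0x7fffffff) % BUCKETS);
        byte[] salted = new byte[originalKey.length + 1];
        salted[0] = prefix;
        System.arraycopy(originalKey, 0, salted, 1, originalKey.length);
        return salted;
    }

    public static void main(String[] args) {
        byte[] key = "row-000123".getBytes();
        // Same original key -> same salted key on every write.
        System.out.println(Arrays.equals(getDistributedKey(key), getDistributedKey(key))); // prints "true"
    }
}
```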

On Sat, May 14, 2011 at 10:17 AM, Weishung Chung <[EMAIL PROTECTED]> wrote:

> Yes, it's simple yet useful. I am integrating it. Thanks a lot :)
>
>
> On Fri, May 13, 2011 at 3:12 PM, Alex Baranau <[EMAIL PROTECTED]> wrote:
>
>> Thanks for the interest!
>>
>> We are using it in production. It is simple and hence quite stable. Though
>> some minor pieces are missing (like
>> https://github.com/sematext/HBaseWD/issues/1) this doesn't affect
>> stability
>> and/or major functionality.
>>
>> Alex Baranau
>> ----
>> Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - Hadoop -
>> HBase
>>
>> On Fri, May 13, 2011 at 10:45 AM, Weishung Chung <[EMAIL PROTECTED]>
>> wrote:
>>
>> > What's the status on this package? Is it mature enough?
>> >  I am using it in my project, tried out the write method yesterday and
>> > going
>> > to incorporate into read method tomorrow.
>> >
>> > On Wed, May 11, 2011 at 3:41 PM, Alex Baranau <[EMAIL PROTECTED]
>> > >wrote:
>> >
>> > > > The start/end rows may be written twice.
>> > >
>> > > Yeah, I know. I meant that the size of the startRow+stopRow data is
>> > > "bearable" in the attribute value no matter how long the keys are,
>> > > since we are already OK with transferring them initially (i.e. we
>> > > should be OK with transferring 2x as much).
>> > >
>> > > So, what about the sourceScan attribute value suggestion I mentioned?
>> > > If you can tell why it isn't sufficient in your case, I'd have more
>> > > info to think about a better suggestion ;)
>> > >
>> > > > It is Okay to keep all versions of your patch in the JIRA.
>> > > > Maybe the second should be named HBASE-3811-v2.patch
>> > > > <https://issues.apache.org/jira/secure/attachment/12478694/HBASE-3811.patch>?
>> > > np. Can do that. Just thought that the patches can be sorted by date
>> > > to find the final one (aka "convention over naming-rules").
>> > >
>> > > Alex.
>> > >
>> > > On Wed, May 11, 2011 at 11:13 PM, Ted Yu <[EMAIL PROTECTED]> wrote:
>> > >
>> > > > >> Though it might be ok, since we anyways "transfer" start/stop
>> > > > >> rows with Scan object.
>> > > > In write() method, we now have:
>> > > >     Bytes.writeByteArray(out, this.startRow);
>> > > >     Bytes.writeByteArray(out, this.stopRow);
>> > > > ...
>> > > >       for (Map.Entry<String, byte[]> attr : this.attributes.entrySet()) {
>> > > >         WritableUtils.writeString(out, attr.getKey());
>> > > >         Bytes.writeByteArray(out, attr.getValue());
>> > > >       }
>> > > > The start/end rows may be written twice.
>> > > >
>> > > > Of course, you have full control over how to generate the unique ID
>> > > > for the "sourceScan" attribute.
>> > > >
>> > > > It is Okay to keep all versions of your patch in the JIRA. Maybe the
>> > > > second should be named HBASE-3811-v2.patch
>> > > > <https://issues.apache.org/jira/secure/attachment/12478694/HBASE-3811.patch>?
>> > > >
>> > > > Thanks
>> > > >
>> > > >
>> > > > On Wed, May 11, 2011 at 1:01 PM, Alex Baranau <[EMAIL PROTECTED]> wrote:
>> > > >
>> > > >> > Can you remove the first version ?
>> > > >> Isn't it ok to keep it in JIRA issue?
>> > > >>
>> > > >>
>> > > >> > In HBaseWD, can you use reflection to detect whether Scan
>> > > >> > supports setAttribute()?
>> > > >> > If it does, can you encode start row and end row as "sourceScan"
>> > > >> > attribute?
>> > > >>
>> > > >> Yeah, something like this is going to be implemented. Though I'd
>> > > >> still want to hear from the devs the story about the Scan version.
>> > > >>
>> > > >>
>> > > >> > One consideration is that start row or end row may be quite long.
>> > > >>
>> > > >> Yeah, that was my thought too at first. Though it might be OK,
>> > > >> since we "transfer" start/stop rows with the Scan object anyway.
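[Archive note: the reflection check Ted suggests can be sketched in plain Java. The stand-in Scan classes below are hypothetical so the demo runs without an HBase dependency; a real check would target org.apache.hadoop.hbase.client.Scan:]

```java
// Sketch of the compatibility check discussed above: use reflection to
// detect whether the Scan class at hand offers setAttribute(String, byte[]),
// and only then stash extra data (e.g. a "sourceScan" value) on the Scan.
// NewStyleScan/OldStyleScan are hypothetical stand-ins for illustration.
public class AttributeSupportCheck {

    static boolean hasSetAttribute(Class<?> scanClass) {
        try {
            // Throws NoSuchMethodException on HBase versions without attributes.
            scanClass.getMethod("setAttribute", String.class, byte[].class);
            return true;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    // Stand-in with the newer API shape.
    static class NewStyleScan {
        public void setAttribute(String name, byte[] value) { /* no-op */ }
    }

    // Stand-in for an older Scan without attribute support.
    static class OldStyleScan { }

    public static void main(String[] args) {
        System.out.println(hasSetAttribute(NewStyleScan.class)); // prints "true"
        System.out.println(hasSetAttribute(OldStyleScan.class)); // prints "false"
    }
}
```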
Later messages in this thread (collapsed):
- Alex Baranau 2011-05-18, 15:03
- Weishung Chung 2011-05-18, 21:15
- Ted Yu 2011-05-18, 23:18
- Weishung Chung 2011-05-19, 03:50
- Alex Baranau 2011-05-19, 13:14
- Weishung Chung 2011-05-19, 13:45
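[Archive note: packing startRow and stopRow into a single attribute value like "sourceScan", as discussed in the thread, could look like the sketch below. HBase's own Bytes.writeByteArray uses a vint length prefix; a fixed 4-byte int is used here purely to keep the demo dependency-free.]

```java
import java.nio.ByteBuffer;

// Sketch (not the actual HBaseWD/HBase code) of encoding two byte arrays
// into one length-prefixed byte[] suitable as a Scan attribute value.
public class SourceScanAttr {

    static byte[] encode(byte[] startRow, byte[] stopRow) {
        // Layout: [int startLen][startRow][int stopLen][stopRow]
        ByteBuffer buf = ByteBuffer.allocate(8 + startRow.length + stopRow.length);
        buf.putInt(startRow.length).put(startRow);
        buf.putInt(stopRow.length).put(stopRow);
        return buf.array();
    }

    static byte[][] decode(byte[] value) {
        ByteBuffer buf = ByteBuffer.wrap(value);
        byte[] start = new byte[buf.getInt()];
        buf.get(start);
        byte[] stop = new byte[buf.getInt()];
        buf.get(stop);
        return new byte[][] { start, stop };
    }

    public static void main(String[] args) {
        byte[][] rt = decode(encode("aaa".getBytes(), "zzz".getBytes()));
        System.out.println(new String(rt[0]) + ".." + new String(rt[1])); // prints "aaa..zzz"
    }
}
```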