Pig, mail # dev - Pig 0.8 HBaseStorage patch


Re: Pig 0.8 HBaseStorage patch
Corbin Hoenes 2011-01-26, 20:29
Okay, I've created a JIRA issue and submitted the patch.  It's my first patch, so
please educate me on proper etiquette.

https://issues.apache.org/jira/browse/PIG-1825

On Wed, Jan 26, 2011 at 12:43 PM, Alan Gates <[EMAIL PROTECTED]> wrote:

> Yahoo is not a huge user of Pig and HBase together yet, so my response to
> this is theoretical rather than based on my need.  But if your work produces
> a significant improvement I would definitely say it is worth contributing.
>  Even if it does not get checked in because we migrate the trunk to work
> with the latest HBase (which maybe already has the work in it) it's still
> worthwhile to have the patch in the JIRA so that those who are using Pig
> with older HBase can apply it to their code and get the benefits.
>
> This functionality should definitely be configurable, since it has
> correctness implications.
>
> Alan.
>
>
> On Jan 24, 2011, at 1:22 PM, Corbin Hoenes wrote:
>
>> We've made a patch to HBaseStorage which allows a caller to turn
>> off the WriteAheadLog feature while doing bulk loads into HBase.
>>
>> From the performance tuning wikipage:
>> http://wiki.apache.org/hadoop/PerformanceTuning
>> "To speed up the inserts in a non critical job (like an import job), you can
>> use Put.writeToWAL(false) to bypass writing to the write ahead log."
>>
>> We've tested this on HBase 0.20.6 and it helps dramatically.  It sounds like
>> future versions of HBase support a feature like this by default, so maybe
>> this problem goes away when we start using 0.90?
>>
>> Is this something valuable to contribute back?
>>
>
>
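
To make the "correctness implications" mentioned above concrete, here is a minimal toy sketch of the trade-off. It is not HBase or Pig code and it is not the patch from the JIRA. ToyRegion, its methods, and the crash simulation are all hypothetical names invented for illustration. The only assumption is the general idea from the thread: edits that skip the write-ahead log are faster to accept but cannot be replayed after a failure.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only: NOT HBase code. It mimics a write-ahead log sitting in
// front of an in-memory store to show what skipping the log gives up.
class ToyRegion {
    // Stand-in for the durable log: survives the simulated crash.
    private final List<String[]> wal = new ArrayList<>();
    // Stand-in for unflushed in-memory data: lost on the simulated crash.
    private Map<String, String> memstore = new HashMap<>();

    // writeToWal plays the role of the flag discussed in the thread.
    void put(String row, String value, boolean writeToWal) {
        if (writeToWal) {
            wal.add(new String[] {row, value});
        }
        memstore.put(row, value);
    }

    // Simulate a crash before any flush, then recovery by replaying the log.
    void crashAndRecover() {
        memstore = new HashMap<>();
        for (String[] edit : wal) {
            memstore.put(edit[0], edit[1]);
        }
    }

    String get(String row) {
        return memstore.get(row);
    }

    public static void main(String[] args) {
        ToyRegion region = new ToyRegion();
        region.put("row1", "logged", true);    // goes through the log
        region.put("row2", "unlogged", false); // bypasses the log
        region.crashAndRecover();
        System.out.println("row1 = " + region.get("row1"));
        System.out.println("row2 = " + region.get("row2"));
    }
}
```

In this sketch the logged row is still readable after the simulated crash, while the unlogged row is gone. That is the reason a switch like this should be opt-in and reserved for jobs, such as imports, that can simply be rerun.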