Re: Pig 0.8 HBaseStorage patch
Okay, I've created a JIRA issue and submitted the patch.  It's my first patch, so
please educate me on proper etiquette.


On Wed, Jan 26, 2011 at 12:43 PM, Alan Gates <[EMAIL PROTECTED]> wrote:

> Yahoo is not a huge user of Pig and HBase together yet, so my response to
> this is theoretical rather than based on my need.  But if your work produces
> a significant improvement I would definitely say it is worth contributing.
> Even if it does not get checked in because we migrate the trunk to work
> with the latest HBase (which maybe already has the work in it) it's still
> worthwhile to have the patch in the JIRA so that those who are using Pig
> with older HBase can apply it to their code and get the benefits.
> This functionality should definitely be configurable, since it has
> correctness implications.
> Alan.
> On Jan 24, 2011, at 1:22 PM, Corbin Hoenes wrote:
>> We've got a patch we've made to HBaseStorage which allows a caller to turn
>> off the WriteAheadLog feature while doing bulk loads into hbase.
>> From the performance tuning wikipage:
>> http://wiki.apache.org/hadoop/PerformanceTuning
>> "To speed up the inserts in a non critical job (like an import job), you can
>> use Put.writeToWAL(false) to bypass writing to the write ahead log."
>> We've tested this on HBase 0.20.6 and it helps dramatically.  It sounds like
>> future versions of HBase support a feature like this by default--so maybe
>> this problem goes away when we start using 0.90?
>> Is this something valuable to contribute back?