Pig >> mail # dev >> Pig 0.8 HBaseStorage patch


Re: Pig 0.8 HBaseStorage patch
Okay, I've created a JIRA and submitted the patch.  It's my first patch, so
please educate me on proper etiquette.

https://issues.apache.org/jira/browse/PIG-1825

On Wed, Jan 26, 2011 at 12:43 PM, Alan Gates <[EMAIL PROTECTED]> wrote:

> Yahoo is not a huge user of Pig and HBase together yet, so my response to
> this is theoretical rather than based on my need.  But if your work produces
> a significant improvement I would definitely say it is worth contributing.
>  Even if it does not get checked in because we migrate the trunk to work
> with the latest HBase (which maybe already has the work in it) it's still
> worthwhile to have the patch in the JIRA so that those who are using Pig
> with older HBase can apply it to their code and get the benefits.
>
> This functionality should definitely be configurable, since it has
> correctness implications.
>
> Alan.
>
>
> On Jan 24, 2011, at 1:22 PM, Corbin Hoenes wrote:
>
>> We've made a patch to HBaseStorage which allows a caller to turn off the
>> write-ahead log (WAL) while doing bulk loads into HBase.
>>
>> From the performance tuning wikipage:
>> http://wiki.apache.org/hadoop/PerformanceTuning
>> "To speed up the inserts in a non critical job (like an import job), you
>> can use Put.writeToWAL(false) to bypass writing to the write ahead log."
>>
>> We've tested this on HBase 0.20.6 and it helps dramatically.  It sounds
>> like future versions of HBase support a feature like this by default, so
>> maybe this problem goes away when we start using 0.90?
>>
>> Is this something valuable to contribute back?
>>
>
>
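
For anyone wondering what the "correctness implications" look like in
practice, here is a minimal sketch of the trade-off. It is a toy model, not
HBase or Pig code: ToyStore, put, crashAndRecover and get are hypothetical
names made up for illustration, and an in-memory list stands in for the
durable log. It only shows that an edit which skipped the log cannot be
replayed after a crash.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of a write-ahead log. NOT HBase code; all names are hypothetical. */
public class WalSketch {

    /** A store whose only "durable" state is the log list. */
    static class ToyStore {
        private final Map<String, String> memstore = new HashMap<>();
        // Stands in for the on-disk log; assumed to survive a crash.
        private final List<String[]> log = new ArrayList<>();

        /** Apply an edit, recording it in the log first unless the caller opts out. */
        void put(String key, String value, boolean writeToWal) {
            if (writeToWal) {
                log.add(new String[] {key, value});
            }
            memstore.put(key, value);
        }

        /** Simulate a crash: in-memory state is lost, then the log is replayed. */
        void crashAndRecover() {
            memstore.clear();
            for (String[] edit : log) {
                memstore.put(edit[0], edit[1]);
            }
        }

        String get(String key) {
            return memstore.get(key);
        }
    }

    public static void main(String[] args) {
        ToyStore store = new ToyStore();
        store.put("row1", "logged", true);    // goes through the log
        store.put("row2", "unlogged", false); // bypasses the log (faster, riskier)

        store.crashAndRecover();

        System.out.println("row1 = " + store.get("row1")); // row1 = logged
        System.out.println("row2 = " + store.get("row2")); // row2 = null
    }
}
```

In this model the unlogged edit is simply gone after recovery, which is why
an import job that skips the log should be one that can be rerun from its
source data, and why the behaviour is better off as an explicit option.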