HBase >> mail # dev >> Unnecessary file copying during the bulkload: should we backport the fix in 0.96?


Re: Unnecessary file copying during the bulkload: should we backport the fix in 0.96?
Thanks Ted. It's great that the fix is already backported.

On Thu, Sep 6, 2012 at 7:19 PM, Ted Yu <[EMAIL PROTECTED]> wrote:
> The fix, HBASE-6529, will be in 0.94.2.
>
> Expect 0.94.2 RC0 next week.
>
> Thanks
>
> On Thu, Sep 6, 2012 at 5:30 PM, pig user <[EMAIL PROTECTED]> wrote:
>
>> In HBase 0.94, bulkload would always copy the files:
>>
>> // Move the file if it's on another filesystem
>> FileSystem srcFs = srcPath.getFileSystem(conf);
>> if (!srcFs.equals(fs)) {
>>     LOG.info("File " + srcPath + " on different filesystem than " +
>>         "destination store - moving to this filesystem.");
>>     ......
>>
>> Because fs here is an instance of HFileSystem, the check never finds
>> the two filesystems equal, so the load takes a long time to complete
>> even when the HFiles are already on the destination cluster.
>>
>> This is fixed in trunk:
>>
>> FileSystem srcFs = srcPath.getFileSystem(conf);
>> FileSystem desFs = fs instanceof HFileSystem ?
>>     ((HFileSystem)fs).getBackingFs() : fs;
>> if (!srcFs.equals(desFs)) {
>>     ... ...
>>
>> My question is: should we backport the fix to 0.94?
>>
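
For readers following the thread, here is a minimal, self-contained sketch of the comparison problem described in the original message. It is not HBase or Hadoop code: Fs, RawFs and WrappingFs are hypothetical stand-ins for FileSystem, a concrete filesystem, and HFileSystem, and the URI-based equals() on RawFs is an assumption made only to keep the example short. Real Hadoop FileSystem equality may behave differently.

```java
import java.util.Objects;

public class BulkLoadFsCheck {
    /** Hypothetical stand-in for Hadoop's FileSystem. */
    interface Fs {}

    /** Hypothetical concrete filesystem, compared by URI (an assumption of this sketch). */
    static final class RawFs implements Fs {
        private final String uri;
        RawFs(String uri) { this.uri = uri; }
        @Override public boolean equals(Object o) {
            return o instanceof RawFs && ((RawFs) o).uri.equals(uri);
        }
        @Override public int hashCode() { return Objects.hash(uri); }
    }

    /** Hypothetical stand-in for HFileSystem: wraps another filesystem. */
    static final class WrappingFs implements Fs {
        private final Fs backing;
        WrappingFs(Fs backing) { this.backing = backing; }
        Fs getBackingFs() { return backing; }
    }

    /** Mirrors the 0.94 check: compares the source against the wrapper itself. */
    static boolean needsCopyNaive(Fs src, Fs dest) {
        return !src.equals(dest);
    }

    /** Mirrors the trunk check: unwraps the destination before comparing. */
    static boolean needsCopyUnwrapped(Fs src, Fs dest) {
        Fs target = dest instanceof WrappingFs
                ? ((WrappingFs) dest).getBackingFs()
                : dest;
        return !src.equals(target);
    }

    public static void main(String[] args) {
        Fs src = new RawFs("hdfs://cluster-a");
        Fs dest = new WrappingFs(new RawFs("hdfs://cluster-a"));
        System.out.println("naive: " + needsCopyNaive(src, dest));         // naive: true
        System.out.println("unwrapped: " + needsCopyUnwrapped(src, dest)); // unwrapped: false
    }
}
```

In this sketch the naive check reports that a copy is needed even though source and destination sit on the same underlying filesystem, while the unwrapped check does not, which matches the behavior difference the thread attributes to the trunk fix.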