Re: Loading text files from local file system
Maybe I misunderstood your constraint ... are you saying that your DFS
itself is running out of space because of file size plus replication? If so,
how about setting dfs.replication to 1 for the job?

There are other options, like chopping up your file and processing it
piecemeal, or perhaps customizing LoadIncrementalHFiles to process
compressed input files, and so forth.
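
A rough sketch of the piecemeal route, assuming a line-oriented input like
the ';'-separated file in the quoted commands: split the local file on line
boundaries, then stage, import and delete one chunk at a time so that only
one chunk sits on HDFS at any moment. The function name split_input, the
chunk_ prefix and the chunk size are placeholders of mine, not HBase names.

```shell
#!/bin/sh
# split_input FILE LINES_PER_CHUNK OUT_DIR
# Split a local text file into chunks of at most LINES_PER_CHUNK lines.
# Chunks break only at line boundaries, so no record is cut in half.
split_input() {
    mkdir -p "$3" || return 1
    # Produces OUT_DIR/chunk_aa, OUT_DIR/chunk_ab, ...
    split -l "$2" "$1" "$3/chunk_"
}

# Per-chunk steps (assumed workflow, shown as comments rather than run):
#   1. copy one chunk to HDFS
#   2. run importtsv and completebulkload on that chunk
#   3. remove the chunk from HDFS before copying the next one
```

For example, split_input product_6.txt 1000000 /tmp/product_6_chunks would
leave files of at most a million lines each in /tmp/product_6_chunks.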

See if the dfs.replication + hfile.compression option works for you first.
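
Roughly what I have in mind, as a dry run that only prints the commands.
The importtsv line is the one quoted below plus -Ddfs.replication=1 and
-Dhfile.compression=snappy; the put line with a single replica is an extra
suggestion of mine for the staged text file. HBASE_JAR is a placeholder
because the jar name is cut off in the quoted command, and whether snappy is
available depends on your install, so treat this as a sketch.

```shell
#!/bin/sh
# Dry run: these functions print commands, they do not execute them.
# Assumptions: HADOOP_HOME points at the Hadoop install; HBASE_JAR is a
# placeholder for the HBase jar (its name is truncated in the thread).

# print_put_cmd LOCAL_FILE HDFS_DIR
# Stage the input with one replica instead of the cluster default.
print_put_cmd() {
    printf '%s fs -D dfs.replication=1 -put %s %s\n' \
        "${HADOOP_HOME}/bin/hadoop" "$1" "$2"
}

# print_importtsv_cmd TABLE BULK_OUTPUT_DIR INPUT_PATH
# Write the generated HFiles compressed and with a single replica.
print_importtsv_cmd() {
    printf '%s jar %s importtsv' "${HADOOP_HOME}/bin/hadoop" "${HBASE_JAR}"
    printf " '-Dimporttsv.separator=;'"
    printf ' -Ddfs.replication=1 -Dhfile.compression=snappy'
    printf ' -Dimporttsv.bulk.output=%s %s %s\n' "$2" "$1" "$3"
}
```

With default 3x replication a 20GB file takes about 60GB of raw DFS space;
with one replica it takes about 20GB, at the cost of no redundancy while the
job runs.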

On Wed, Apr 17, 2013 at 1:00 AM, Suraj Varma <[EMAIL PROTECTED]> wrote:

> Have you considered using hfile.compression, perhaps with snappy
> compression?
> See this thread:
> http://grokbase.com/t/hbase/user/10cqrd06pc/hbase-bulk-load-script
> --Suraj
> On Tue, Apr 16, 2013 at 9:31 PM, Omkar Joshi <[EMAIL PROTECTED]> wrote:
>> The background thread is here :
>> http://mail-archives.apache.org/mod_mbox/hbase-user/201304.mbox/%[EMAIL PROTECTED]%3E
>> These are the commands that I'm using to load files into HBase:
>> HADOOP_CLASSPATH=`${HBASE_HOME}/bin/hbase classpath` \
>>   ${HADOOP_HOME}/bin/hadoop jar ${HBASE_HOME}/hbase- importtsv \
>>   '-Dimporttsv.separator=;' \
>>   -Dimporttsv.bulk.output=hdfs://cldx-1139-1033:9000/hbase/storefileoutput_6 \
>>   PRODUCTS hdfs://cldx-1139-1033:9000/hbase/copiedFromLocal/product_6.txt
>>
>> HADOOP_CLASSPATH=`${HBASE_HOME}/bin/hbase classpath` \
>>   ${HADOOP_HOME}/bin/hadoop jar ${HBASE_HOME}/hbase- \
>>   completebulkload hdfs://cldx-1139-1033:9000/hbase/storefileoutput_6 PRODUCTS
>> As seen, the text files to be loaded into HBase first need to be copied to
>> HDFS. Given our infrastructure constraints, I'm running into space issues:
>> the text files hold around 20GB of data, and replication on top of that is
>> consuming a lot of DFS space.
>> Is there a way to load a text file directly from the local file system
>> into HBase?
>> Regards,
>> Omkar Joshi