HBase user mailing list: Loading text files from local file system


Omkar Joshi 2013-04-17, 04:31
Surendra Manchikanti 2013-04-17, 05:19
Suraj Varma 2013-04-17, 08:00
Re: Loading text files from local file system
Maybe I misunderstood your constraint ... are you saying that your DFS
itself is constrained by the file size & replication? If so, how about
setting dfs.replication to 1 for the job?
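
A minimal sketch of what that could look like, assuming the HDFS shell
honors a per-command dfs.replication override via the generic -D option
(the local path below is a placeholder; the HDFS path is taken from the
commands quoted further down):

# put the 20GB input with a single replica instead of the default 3
${HADOOP_HOME}/bin/hadoop fs -D dfs.replication=1 \
  -put /local/path/product_6.txt \
  hdfs://cldx-1139-1033:9000/hbase/copiedFromLocal/product_6.txt

# or drop the replication of a file that is already on HDFS
${HADOOP_HOME}/bin/hadoop fs -setrep -w 1 \
  hdfs://cldx-1139-1033:9000/hbase/copiedFromLocal/product_6.txt

Passing -Ddfs.replication=1 to the importtsv job itself would similarly
keep the generated HFiles at a single replica.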

There are other options, like chopping up your file and processing it
piecemeal ... or perhaps customizing LoadIncrementalHFiles to process
compressed input files, and so forth ...
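
For the chop-it-up route, a rough sketch of the loop (the chunk size, file
names and staging directory are made up; the load step stands in for the
importtsv + completebulkload commands quoted further down):

# split the local file into smaller pieces (line count chosen arbitrarily)
split -l 20000000 product_6.txt product_chunk_

# stage, bulk-load and clean up one chunk at a time, so HDFS never holds
# more than one chunk plus its HFile output
for chunk in product_chunk_*; do
  ${HADOOP_HOME}/bin/hadoop fs -put "$chunk" /hbase/staging/"$chunk"
  # ... run importtsv and completebulkload against /hbase/staging/"$chunk" ...
  ${HADOOP_HOME}/bin/hadoop fs -rm /hbase/staging/"$chunk"
done

Each chunk produces its own set of HFiles, so this trades more bulk-load
rounds for a smaller HDFS footprint at any one time.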

See if the dfs.replication + hfile.compression option works for you first.
--Suraj
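
For the compression side, a hedged sketch of the extra flag on the ImportTsv
run (whether this 0.94 build honors hfile.compression for the bulk output,
and whether the Snappy native libraries are installed, are assumptions;
everything else is copied from the commands quoted further down):

HADOOP_CLASSPATH=`${HBASE_HOME}/bin/hbase classpath` \
${HADOOP_HOME}/bin/hadoop jar ${HBASE_HOME}/hbase-0.94.6.1.jar importtsv \
  -Dhfile.compression=snappy \
  '-Dimporttsv.separator=;' \
  -Dimporttsv.columns=HBASE_ROW_KEY,PRODUCT_INFO:NAME,PRODUCT_INFO:CATEGORY,PRODUCT_INFO:GROUP,PRODUCT_INFO:COMPANY,PRODUCT_INFO:COST,PRODUCT_INFO:COLOR,PRODUCT_INFO:BLANK_COLUMN \
  -Dimporttsv.bulk.output=hdfs://cldx-1139-1033:9000/hbase/storefileoutput_6 \
  PRODUCTS hdfs://cldx-1139-1033:9000/hbase/copiedFromLocal/product_6.txt

If the property is ignored in this version, setting SNAPPY compression on the
PRODUCT_INFO column family itself should have the same effect, since the
bulk-load output format copies the family's compression settings. Either way
this only shrinks the generated HFiles; the 20GB input text file on HDFS is
unaffected.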

On Wed, Apr 17, 2013 at 1:00 AM, Suraj Varma <[EMAIL PROTECTED]> wrote:

> Have you considered using hfile.compression, perhaps with snappy
> compression?
> See this thread:
> http://grokbase.com/t/hbase/user/10cqrd06pc/hbase-bulk-load-script
> --Suraj
>
>
>
> On Tue, Apr 16, 2013 at 9:31 PM, Omkar Joshi <[EMAIL PROTECTED]> wrote:
>
>> The background thread is here :
>>
>>
>> http://mail-archives.apache.org/mod_mbox/hbase-user/201304.mbox/%[EMAIL PROTECTED]%3E
>>
>> Following are the commands that I'm using to load files into HBase:
>>
>> HADOOP_CLASSPATH=`${HBASE_HOME}/bin/hbase classpath` \
>> ${HADOOP_HOME}/bin/hadoop jar ${HBASE_HOME}/hbase-0.94.6.1.jar importtsv \
>> '-Dimporttsv.separator=;' \
>> -Dimporttsv.columns=HBASE_ROW_KEY,PRODUCT_INFO:NAME,PRODUCT_INFO:CATEGORY,PRODUCT_INFO:GROUP,PRODUCT_INFO:COMPANY,PRODUCT_INFO:COST,PRODUCT_INFO:COLOR,PRODUCT_INFO:BLANK_COLUMN \
>> -Dimporttsv.bulk.output=hdfs://cldx-1139-1033:9000/hbase/storefileoutput_6 \
>> PRODUCTS hdfs://cldx-1139-1033:9000/hbase/copiedFromLocal/product_6.txt
>>
>> HADOOP_CLASSPATH=`${HBASE_HOME}/bin/hbase classpath` \
>> ${HADOOP_HOME}/bin/hadoop jar ${HBASE_HOME}/hbase-0.94.6.1.jar \
>> completebulkload hdfs://cldx-1139-1033:9000/hbase/storefileoutput_6 PRODUCTS
>>
>> As seen, the text files to be loaded into HBase first need to be copied
>> onto HDFS. Given our infrastructure constraints/limitations, I'm running
>> into space issues: the data in the text files is around 20GB, and
>> replication is consuming a lot of DFS space.
>>
>> Is there a way to load a text file into HBase directly from the local
>> file system?
>>
>> Regards,
>> Omkar Joshi
>>
>
>
Omkar Joshi 2013-04-17, 09:11