

Re: Loading text files from local file system
Have you considered setting hfile.compression, perhaps to snappy, so that
the generated HFiles take up less space?
See this thread:
http://grokbase.com/t/hbase/user/10cqrd06pc/hbase-bulk-load-script
--Suraj
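
As a rough illustration of the suggestion above, the sketch below only prints an importtsv command line with a compression property added; it does not run the job. The paths and column list are copied from the quoted message. The default values for HBASE_HOME and HADOOP_HOME, and the build_importtsv_cmd helper, are hypothetical. Whether hfile.compression takes effect may depend on the HBase version and on the compression configured for the table's column family, so treat this as something to verify on your cluster rather than a confirmed fix.

```shell
#!/bin/sh
# Dry-run sketch: print the importtsv command with compression requested.
# Assumptions (not from the original thread): the default install paths below,
# and that the snappy native libraries are available on the cluster.
HBASE_HOME="${HBASE_HOME:-/opt/hbase}"      # hypothetical default
HADOOP_HOME="${HADOOP_HOME:-/opt/hadoop}"   # hypothetical default
COMPRESSION="${COMPRESSION:-snappy}"

# Values taken from the quoted message.
COLUMN_SPEC="HBASE_ROW_KEY,PRODUCT_INFO:NAME,PRODUCT_INFO:CATEGORY,PRODUCT_INFO:GROUP,PRODUCT_INFO:COMPANY,PRODUCT_INFO:COST,PRODUCT_INFO:COLOR,PRODUCT_INFO:BLANK_COLUMN"
BULK_OUTPUT="hdfs://cldx-1139-1033:9000/hbase/storefileoutput_6"
INPUT_FILE="hdfs://cldx-1139-1033:9000/hbase/copiedFromLocal/product_6.txt"

# Assemble the command as text. All -D options come before the table name
# so that Hadoop's generic option parsing picks them up.
build_importtsv_cmd() {
  echo "${HADOOP_HOME}/bin/hadoop jar ${HBASE_HOME}/hbase-0.94.6.1.jar importtsv" \
    "'-Dimporttsv.separator=;'" \
    "-Dhfile.compression=${COMPRESSION}" \
    "-Dimporttsv.columns=${COLUMN_SPEC}" \
    "-Dimporttsv.bulk.output=${BULK_OUTPUT}" \
    "PRODUCTS ${INPUT_FILE}"
}

build_importtsv_cmd
```

To actually run the printed command, HADOOP_CLASSPATH would still need to be set from `hbase classpath`, as in the quoted message.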

On Tue, Apr 16, 2013 at 9:31 PM, Omkar Joshi <[EMAIL PROTECTED]> wrote:

> The background thread is here:
>
>
> http://mail-archives.apache.org/mod_mbox/hbase-user/201304.mbox/%[EMAIL PROTECTED]%3E
>
> These are the commands that I'm using to load files into HBase:
>
> HADOOP_CLASSPATH=`${HBASE_HOME}/bin/hbase classpath` \
>   ${HADOOP_HOME}/bin/hadoop jar ${HBASE_HOME}/hbase-0.94.6.1.jar importtsv \
>   '-Dimporttsv.separator=;' \
>   -Dimporttsv.columns=HBASE_ROW_KEY,PRODUCT_INFO:NAME,PRODUCT_INFO:CATEGORY,PRODUCT_INFO:GROUP,PRODUCT_INFO:COMPANY,PRODUCT_INFO:COST,PRODUCT_INFO:COLOR,PRODUCT_INFO:BLANK_COLUMN \
>   -Dimporttsv.bulk.output=hdfs://cldx-1139-1033:9000/hbase/storefileoutput_6 \
>   PRODUCTS hdfs://cldx-1139-1033:9000/hbase/copiedFromLocal/product_6.txt
>
> HADOOP_CLASSPATH=`${HBASE_HOME}/bin/hbase classpath` \
>   ${HADOOP_HOME}/bin/hadoop jar ${HBASE_HOME}/hbase-0.94.6.1.jar \
>   completebulkload hdfs://cldx-1139-1033:9000/hbase/storefileoutput_6 PRODUCTS
>
> As shown, the text files must first be copied to HDFS before they can be
> loaded into HBase. Given our infrastructure constraints, I'm running out of
> space: the text files total around 20 GB, and with replication they consume
> a lot of DFS capacity.
>
> Is there a way to load a text file directly from the local file system into
> HBase?
>
> Regards,
> Omkar Joshi
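
For a rough sense of the space pressure described in the quoted message, the arithmetic can be sketched as below. The 20 GB figure comes from the message. The replication factor of 3 is an assumption based on the stock HDFS default for dfs.replication, since the cluster's actual setting is not stated, and dfs_usage_gb is a hypothetical helper. The HFiles written to the bulk output directory are replicated as well, but their size is not given in the thread, so they are left out.

```shell
#!/bin/sh
# Back-of-the-envelope estimate of raw DFS capacity used by the input copy.
RAW_GB=20                         # size of the text files, from the message
REPLICATION="${REPLICATION:-3}"   # assumption: HDFS default dfs.replication

# Multiply logical size (GB) by the replication factor.
dfs_usage_gb() {
  echo $(( $1 * $2 ))
}

echo "Input copy on HDFS: $(dfs_usage_gb "$RAW_GB" "$REPLICATION") GB of raw capacity"
```

Running it with REPLICATION=1 shows how much raw capacity a lower replication factor for the staging copy would save, if that trade-off is acceptable for temporary input data.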