HBase >> mail # user >> Problems while exporting from Hbase to CSV file


Problems while exporting from Hbase to CSV file
Hi,
I am trying to export from HBase to a CSV file.
I am using the "Scan" class to scan all data in the table,
but I am facing some problems while doing it.

1) My table has around 1.5 million rows and around 150 columns per
row, so I cannot use the default scan() constructor, as it will scan the
whole table in one go, which results in an OutOfMemory error in the client
process. I have heard of using setCaching() and setBatch(), but I don't
understand how they will solve the OOM error.

I thought of providing startRow and stopRow in the Scan object, but I want
to scan the whole table, so how would that help?
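(For context: a ResultScanner streams rows rather than loading the whole table into the client; Scan.setCaching(n) controls how many rows are fetched per RPC, and Scan.setBatch(m) caps how many cells come back per Result for very wide rows. The OOM usually comes from the client accumulating every Result before writing. A rough Python sketch of the streaming idea, with hypothetical names that only model the paging, not the actual HBase API:)

```python
import csv
import io

def scan_in_pages(table, caching=500):
    """Yield rows one at a time, fetching `caching` rows per simulated RPC.

    `table` is a dict of row_key -> {column: value}, standing in for an
    HBase table; the paging mirrors what Scan.setCaching does, so only
    one page of rows is ever held in client memory.
    """
    keys = sorted(table)
    for start in range(0, len(keys), caching):
        # one "RPC" worth of rows
        for key in keys[start:start + caching]:
            yield key, table[key]

def export_to_csv(table, columns, out, caching=500):
    """Write each row to CSV as it arrives instead of buffering them all."""
    writer = csv.writer(out)
    writer.writerow(["row_key"] + columns)
    for key, row in scan_in_pages(table, caching):
        writer.writerow([key] + [row.get(c, "") for c in columns])
```

The point of the sketch is that memory use is bounded by the page size, not the table size, because each row is written out before the next page is fetched.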

2) As HBase stores data for a row only when we explicitly provide it, and
there is no concept of a default value as found in an RDBMS, I want every
column to appear in the CSV file I generate for each user. Where column
values are missing in HBase, I want to use default values for them (I have
a list of default values for each column). Is there a method in the
Result class, or any other class, to accomplish this?
Please help here.
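(For context: there is no built-in default-value support in Result; the usual pattern is to call Result.getValue(family, qualifier) for each expected column and substitute your default when it returns null. A small Python sketch of that fallback, with hypothetical names:)

```python
def row_to_csv_fields(row, columns, defaults):
    """Build one CSV record, falling back to defaults for absent columns.

    `row` maps column name -> stored value (only the cells actually
    present in HBase); `defaults` maps every expected column name to the
    value to emit when the cell is missing.
    """
    return [row[c] if c in row else defaults[c] for c in columns]
```

Looping this over every row while writing, rather than collecting all records first, also keeps it compatible with the streaming export above.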

--
Thanks and Regards,
Vimal Jain
Replies:
Azuryy Yu 2013-06-27, 07:21
Azuryy Yu 2013-06-27, 07:22
Michael Segel 2013-06-27, 22:32
Anoop John 2013-06-28, 12:23
Michael Segel 2013-06-28, 12:45