Problems while exporting from HBase to a CSV file
Hi,
I am trying to export data from HBase to a CSV file.
I am using the "Scan" class to scan all the data in the table,
but I am facing some problems while doing it.

1) My table has around 1.5 million rows and around 150 columns per
row, so I cannot use a Scan built with the default constructor, as it
will scan the whole table in one go, which results in an OutOfMemoryError
in the client process. I have heard of using setCaching() and setBatch(),
but I do not understand how they would solve the OOM error.

I thought of setting startRow and stopRow on the Scan object, but I want
to scan the whole table, so how would that help?
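
To make the question concrete: as far as I understand, setCaching() sets how many rows the client fetches and buffers per round trip, and setBatch() caps how many cells come back in a single Result. The sketch below is only a plain-Java model of that idea, not the HBase client API. The class PagedScanModel and its methods are hypothetical names, and the row count and caching value are taken from the numbers above or assumed. It shows that iterating a scanner which refills a fixed-size buffer keeps at most `caching` rows in client memory, regardless of table size.

```java
import java.util.ArrayDeque;
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical model of a scanner that buffers at most `caching` rows at a time.
// It does not talk to HBase; rows are generated on demand to stand in for a server.
class PagedScanModel implements Iterator<String> {
    private final long totalRows;                 // rows in the pretend table
    private final int caching;                    // rows fetched per simulated round trip
    private final ArrayDeque<String> buffer = new ArrayDeque<>();
    private long nextRow = 0;                     // next row index to fetch
    private int maxBuffered = 0;                  // high-water mark of buffered rows
    private int roundTrips = 0;                   // number of simulated fetches

    PagedScanModel(long totalRows, int caching) {
        if (caching <= 0) {
            throw new IllegalArgumentException("caching must be positive");
        }
        this.totalRows = totalRows;
        this.caching = caching;
    }

    // Simulates one round trip: pulls up to `caching` rows into the buffer.
    private void fill() {
        roundTrips++;
        while (buffer.size() < caching && nextRow < totalRows) {
            buffer.add("row-" + nextRow++);
        }
        maxBuffered = Math.max(maxBuffered, buffer.size());
    }

    @Override
    public boolean hasNext() {
        if (buffer.isEmpty() && nextRow < totalRows) {
            fill();
        }
        return !buffer.isEmpty();
    }

    @Override
    public String next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        return buffer.poll();
    }

    int maxBuffered() { return maxBuffered; }

    int roundTrips() { return roundTrips; }

    public static void main(String[] args) {
        // 1.5 million rows as in the table above; 500 is an assumed caching value.
        PagedScanModel scan = new PagedScanModel(1_500_000L, 500);
        long written = 0;
        while (scan.hasNext()) {
            scan.next();          // a real exporter would write the CSV line here
            written++;
        }
        System.out.println(written + " rows exported, at most "
                + scan.maxBuffered() + " buffered, "
                + scan.roundTrips() + " round trips");
    }
}
```

If the real scanner behaves like this model, a full-table scan should not need startRow or stopRow to stay within memory, as long as each row is written out as soon as it is read instead of being collected into a list.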

2) HBase stores a value for a column only when we explicitly provide it,
and there is no concept of a default value as found in an RDBMS. I want
every column to appear in the CSV file I generate for every user. When a
column value is not present in HBase, I want to use a default value for
it (I have a list of default values for each column). Is there any method
in the Result class, or in any other class, to accomplish this?
Please help.
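
To illustrate the output I am after in point 2, here is a minimal plain-Java sketch. It uses a Map in place of an HBase Result, and the class CsvDefaults, the column names, and the default values are all hypothetical. The assumption is that a lookup for an absent column yields null (which, as far as I know, is what Result.getValue() returns for a missing cell), so the fallback can be applied while building each line.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper: builds CSV lines that always contain every known column.
class CsvDefaults {

    // Emits the row key followed by one field per entry in `defaults`, in the
    // iteration order of `defaults`. A column missing from `stored` gets its default.
    static String toCsvLine(String rowKey,
                            Map<String, String> stored,
                            Map<String, String> defaults) {
        StringJoiner line = new StringJoiner(",");
        line.add(escape(rowKey));
        for (Map.Entry<String, String> column : defaults.entrySet()) {
            String value = stored.get(column.getKey());
            line.add(escape(value != null ? value : column.getValue()));
        }
        return line.toString();
    }

    // Quotes a field if it contains a comma, a double quote, or a line break,
    // doubling any embedded double quotes.
    static String escape(String field) {
        if (field.contains(",") || field.contains("\"")
                || field.contains("\n") || field.contains("\r")) {
            return "\"" + field.replace("\"", "\"\"") + "\"";
        }
        return field;
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the column order stable across all rows.
        Map<String, String> defaults = new LinkedHashMap<>();
        defaults.put("name", "unknown");
        defaults.put("city", "N/A");
        defaults.put("age", "0");

        // Only one column was ever written for this hypothetical user.
        Map<String, String> stored = new HashMap<>();
        stored.put("name", "Vimal");

        System.out.println(toCsvLine("user1", stored, defaults));
    }
}
```

Keeping the defaults in an ordered map means the same map can also supply the CSV header, so every line has the same columns in the same order.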

--
Thanks and Regards,
Vimal Jain