Phoenix, Hive, Pig, or plain Java would all work.
But to Azuryy Yu's post...
The OP is doing a simple scan() to get rows.
If the OP is hitting an OOM exception, then it's a code issue on the part of the OP.
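
To make that concrete, here is a rough sketch of a client loop that should not run out of memory: it handles one row at a time, writes it straight to the CSV output, and fills in a default for any column the row lacks (point 2 in the quoted thread). It is plain Java so it runs without a cluster. The Iterator of Maps is only a stand-in for iterating a ResultScanner, and the names CsvExportSketch, toCsvLine, escape and export are made up for illustration. In a real client you would presumably build each row from a Result, where Result.getValue(family, qualifier) returns null for a missing column, and use Scan.setCaching() only to control how many rows are fetched per round trip.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: the Iterator<Map<...>> stands in for a ResultScanner.
class CsvExportSketch {

    // Quote a field only when it contains a comma, a quote, or a newline.
    static String escape(String value) {
        if (value.contains(",") || value.contains("\"") || value.contains("\n")) {
            return "\"" + value.replace("\"", "\"\"") + "\"";
        }
        return value;
    }

    // Build one CSV line. The defaults map fixes the column order and
    // supplies the value for every column the row does not contain.
    static String toCsvLine(Map<String, String> row, Map<String, String> defaults) {
        StringBuilder line = new StringBuilder();
        boolean first = true;
        for (Map.Entry<String, String> column : defaults.entrySet()) {
            if (!first) {
                line.append(',');
            }
            first = false;
            String stored = row.get(column.getKey());
            line.append(escape(stored != null ? stored : column.getValue()));
        }
        return line.toString();
    }

    // Write each row as soon as it is read. Nothing is collected in a list,
    // so memory use does not grow with the number of rows.
    static long export(Iterator<Map<String, String>> rows,
                       Map<String, String> defaults,
                       Writer out) throws IOException {
        long written = 0;
        while (rows.hasNext()) {
            out.write(toCsvLine(rows.next(), defaults));
            out.write('\n');
            written++;
        }
        return written;
    }

    public static void main(String[] args) throws IOException {
        // Column order and default values (made-up sample data).
        Map<String, String> defaults = new LinkedHashMap<>();
        defaults.put("name", "unknown");
        defaults.put("city", "n/a");
        defaults.put("age", "0");

        Map<String, String> first = new HashMap<>();
        first.put("name", "user1");
        first.put("age", "30");

        Map<String, String> second = new HashMap<>();
        second.put("city", "Pune, IN");

        List<Map<String, String>> rows = Arrays.asList(first, second);

        StringWriter out = new StringWriter();
        long count = export(rows.iterator(), defaults, out);
        System.out.print(out);
        System.out.println(count + " rows written");
    }
}
```

The usual way to hit an OOM in this kind of export is to add every row to a list before writing anything. Streaming to a Writer, as above, avoids that regardless of table size.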
On Jun 27, 2013, at 2:22 AM, Azuryy Yu <[EMAIL PROTECTED]> wrote:
> Sorry, maybe Phoenix is not suitable for you.
> On Thu, Jun 27, 2013 at 3:21 PM, Azuryy Yu <[EMAIL PROTECTED]> wrote:
>> 1) Use Scan.setCaching() to specify the number of rows for caching that
>> will be passed to scanners.
>> Also, what's your block cache size?
>> But if the OOM is on the client, not the server side, then I don't think
>> this is Scan related; please check your client code.
>> 2) We cannot add default values from HBase, but you can add them in your
>> client when you iterate over the Result.
>> Also, you can use Phoenix; it is a good fit for your scenario.
>> On Thu, Jun 27, 2013 at 3:11 PM, Vimal Jain <[EMAIL PROTECTED]> wrote:
>>> I am trying to export from HBase to a CSV file.
>>> I am using the "Scan" class to scan all the data in the table,
>>> but I am facing some problems while doing it.
>>> 1) My table has around 1.5 million rows and around 150 columns for each
>>> row, so I cannot use the default Scan() constructor, as it will scan the
>>> whole table in one go, which results in an OutOfMemory error in the client
>>> process. I have heard of using setCaching() and setBatch(), but I am not
>>> able to understand how they will solve the OOM error.
>>> I thought of providing startRow and stopRow in the Scan object, but I want
>>> to scan the whole table, so how will this help?
>>> 2) As HBase stores data for a row only when we explicitly provide it, and
>>> there is no concept of a default value as found in an RDBMS, I want to have
>>> each and every column in the CSV file I generate for every user. In case
>>> column values are not there in HBase, I want to use default values for
>>> them (I have a list of default values for each column). Is there any method
>>> in the Result class or any other class to accomplish this?
>>> Please help here.
>>> Thanks and Regards,
>>> Vimal Jain