HBase, mail # user - RE: Inserting Data from CSV into HBase


Re: Inserting Data from CSV into HBase
Anil Gupta 2012-03-06, 23:58
Hi Keshav,

There seems to be a problem with bulk load when importing data from a CSV file. I ran into the same problem yesterday and posted about it on the mailing list. I got pulled into another task at work, so I have not been able to devote much time to it. I have identified the problem but still need to work out the fix. I will post the solution once I have it.

Best Regards,
Anil

On Mar 6, 2012, at 6:37 AM, <[EMAIL PROTECTED]> wrote:

> Did you try adding a comma at the end of each line, just to see what happens?
>
>
> On Mar 6, 2012, at 5:02 AM, ext Savant, Keshav wrote:
>
>> Hi,
>>
>> I tried bulk uploading and it ran well with TSV files. We first ran importtsv and then completebulkload. After these two steps I can scan my HBase table and see the data. I can also see the data when I browse the HDFS of my Hadoop cluster in a web browser.
>>
>> But when I try to upload the CSV files in a folder, every line of every CSV file is reported as a bad line. I use the following command to load the CSV files from my local file system:
>>
>> HADOOP_CLASSPATH=`hbase classpath` $HADOOP_HOME/bin/hadoop jar /hbase_home/hbase-0.92.0/hbase-0.92.0.jar importtsv  -Dimporttsv.bulk.output=/my_output_dir -Dimporttsv.columns=HBASE_ROW_KEY,SerialNumber,Column1,Column2 my_table file:/my_csv/data.txt '-Dimporttsv.separator=,'
>>
>> My CSV file has the following format:
>>
>> 1,data11,data12
>> 2,data21,data22
>> 3,data31,data32
>> .....
>> .....
>>
>> And my HBase table has 3 columns
>>
>>
>> Please let me know what the exact problem is and how it can be resolved.
>>
>> Kind regards,
>> Keshav
>>
>>
>>
>> -----Original Message-----
>> From: Savant, Keshav
>> Sent: Friday, March 02, 2012 7:02 PM
>> To: [EMAIL PROTECTED]
>> Cc: '[EMAIL PROTECTED]'
>> Subject: RE: Inserting Data from CSV into HBase
>>
>> Hi Harsh,
>>
>> Thanks for your response. I don't get any error using the code mentioned in that URL. I will get back to you after analyzing the tools you suggested.
>> Thanks again.
>>
>>
>> Kind regards,
>> Keshav C Savant
>>
>> -----Original Message-----
>> From: Harsh J [mailto:[EMAIL PROTECTED]]
>> Sent: Friday, March 02, 2012 6:51 PM
>> To: [EMAIL PROTECTED]
>> Subject: Re: Inserting Data from CSV into HBase
>>
>> Hi,
>>
>> You may use the importtsv tool and the bulk-load utilities in HBase to achieve this quickly and easily.
>>
>> This is detailed at http://hbase.apache.org/bulk-loads.html (see the section about importtsv near the bottom) and also in the section "Using the importtsv tool" on page 460 of Lars George's "HBase: The Definitive Guide" (O'Reilly).
>>
>> Also, when you say something didn't work, please supply any errors you encountered and the configuration you used. It's hard to help without those.
>>
>> On Fri, Mar 2, 2012 at 6:24 PM, Savant, Keshav <[EMAIL PROTECTED]> wrote:
>>> Hi All,
>>>
>>> I am looking for a way to map my existing CSV files to an HBase table. Basically, I want only one value for each column family (just like in an RDBMS).
>>>
>>> To illustrate, suppose I define an HBase table as
>>>
>>> create 'inventory', 'item', 'supplier', 'quantity'
>>> (here the table name is inventory and it has three column families
>>> named item, supplier and quantity)
>>>
>>> Now I want to load N CSV files in the following format into this
>>> HBase table:
>>>
>>> Burger,abc confectionary,100
>>> Pizza,xyz bakers,50
>>> ...
>>> ...
>>> ...
>>>
>>> I want to put the CSV data into my inventory table in HBase. The number of lines per CSV file and the number of CSV files both vary, and loading will be a continuous process.
>>>
>>> Is there any way to achieve this? I tried SampleUploader, as given at http://svn.apache.org/repos/asf/hbase/trunk/src/examples/mapreduce/org/apache/hadoop/hbase/mapreduce/SampleUploader.java, but it did not work: the program ran successfully, yet no data was populated in the HBase table.
>>>
>>> Please advise; any help is appreciated.
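
On the "bad lines" symptom reported in this thread: in the importtsv command quoted above, '-Dimporttsv.separator=,' appears after the table name and input path, whereas Hadoop generic options such as -D are normally given before the positional arguments. One plausible explanation, not confirmed anywhere in this thread, is that the separator setting is never applied and the lines are still split on tabs. The Python sketch below is a simplified, hypothetical model of a per-line field check, not the actual ImportTsv code; the function name classify_lines and its rule are assumptions made purely for illustration.

```python
def classify_lines(lines, separator="\t", num_columns=4):
    """Split lines into (good, bad) lists using a simplified field check.

    Assumed rule (illustrative only): a line is bad when the separator
    never occurs in it, or when it has more fields than declared columns.
    """
    good, bad = [], []
    for line in lines:
        fields = line.rstrip("\n").split(separator)
        if len(fields) < 2 or len(fields) > num_columns:
            bad.append(line)
        else:
            good.append(line)
    return good, bad


# Sample rows copied from the CSV format shown in the thread.
sample = ["1,data11,data12", "2,data21,data22", "3,data31,data32"]

# Separator left at the default tab (as if the -D option were ignored).
good_tab, bad_tab = classify_lines(sample, separator="\t")

# Separator set to a comma (as if the -D option were applied).
good_comma, bad_comma = classify_lines(sample, separator=",")

print(len(bad_tab), len(bad_comma))
```

Under this model every sample line is rejected when the separator is a tab, and all three are accepted when it is a comma, which is consistent with TSV files loading cleanly while every CSV line is flagged.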