Problem with xml data in hbase bulk loading
iwannaplay games 2012-11-20, 07:54
Hi All,
I have a CSV file (separated by |) where the data looks like this:

id   data                                      date
1    apple                                     24-nov-2011
2    mango                                     26-nov-2011
3    <?xml version="1.0" encoding="utf-8"?>
     <a>fruits</a>                             28-nov-2011
4    papaya                                    30-nov-2011

Since the data field for id=3 contains a newline, HBase importtsv reads only the first line as that row and treats the second line as a separate row. I want the full XML field to be stored in the data column of the HBase table.
./hadoop jar /usr/local/hbase/hbase-0.92.1.jar importtsv -Dimporttsv.bulk.output=eve -Dimporttsv.columns=HBASE_ROW_KEY,el:data,el:Date '-Dimporttsv.separator=|' fruits /fruits/fr
How should I handle XML data in HBase when doing a bulk load?
Thanks & Regards Prabhjot
Re: Problem with xml data in hbase bulk loading
Jean-Marc Spaggiari 2012-11-20, 11:59
Hi
In csv files, new line = new entry :(
So I think your only option is to fix your input file by removing your extra lines.
JM
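One way to fix the input file as suggested above is to merge physical lines until each record has the expected number of separators. The following is a minimal Python sketch, not something posted in the thread. It assumes three columns, | as the separator, and that the XML itself never contains a | character. The function name join_continuation_lines and the pipe-delimited sample rows are illustrative only.

```python
def join_continuation_lines(lines, separator="|", columns=3):
    """Merge physical lines until each record holds columns - 1 separators.

    Assumption: the separator never occurs inside a field value, so counting
    separators is enough to tell when a record is complete.
    """
    needed = columns - 1
    records = []
    pending = ""
    for raw in lines:
        line = raw.rstrip("\r\n")
        # Join with a space so tokens on adjacent lines do not run together.
        pending = line if not pending else pending + " " + line
        if pending.count(separator) >= needed:
            records.append(pending)
            pending = ""
    if pending:
        raise ValueError("incomplete trailing record: %r" % pending)
    return records


# Hypothetical sample mirroring the rows in the question.
sample = [
    "1|apple|24-nov-2011\n",
    "2|mango|26-nov-2011\n",
    '3|<?xml version="1.0" encoding="utf-8"?>\n',
    "<a>fruits</a>|28-nov-2011\n",
    "4|papaya|30-nov-2011\n",
]

for record in join_continuation_lines(sample):
    print(record)
```

The cleaned records can then be written to a new file and loaded with the same importtsv command. The embedded newline is replaced by a space here, which is harmless for XML between elements but would alter text content that depends on exact line breaks.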
Re: Problem with xml data in hbase bulk loading
anil gupta 2012-11-21, 07:44
Hi,
There are two options:
1. Fix the input file so that each line contains an entire record.
2. Write a custom input format to read records that span multiple lines. If you do this, you will also need to write a custom Mapper. For a reference implementation of a custom mapper, have a look at the ImportTsv class in HBase.
HTH,
Anil Gupta
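For the first option, it may help to check that every line really is a complete record before running importtsv. Below is a minimal Python sketch of such a check. It assumes three columns, | as the separator, and no | inside any field. The helper name find_incomplete_lines and the sample rows are hypothetical.

```python
def find_incomplete_lines(lines, separator="|", columns=3):
    """Return (line_number, text) pairs whose field count differs from columns.

    Assumption: the separator never occurs inside a field value.
    """
    problems = []
    for number, raw in enumerate(lines, start=1):
        line = raw.rstrip("\r\n")
        if len(line.split(separator)) != columns:
            problems.append((number, line))
    return problems


# Hypothetical sample: the XML value is split across lines 3 and 4.
rows = [
    "1|apple|24-nov-2011\n",
    "2|mango|26-nov-2011\n",
    '3|<?xml version="1.0" encoding="utf-8"?>\n',
    "<a>fruits</a>|28-nov-2011\n",
    "4|papaya|30-nov-2011\n",
]

for number, text in find_incomplete_lines(rows):
    print("line %d is not a complete record: %s" % (number, text))
```

An empty result suggests the file is safe to load line by line. Any reported lines point at records that still span more than one line.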
Re: Problem with xml data in hbase bulk loading
iwannaplay games 2012-11-21, 10:38
Hi all,
I tried importing the data with Sqoop, and it puts the XML data in a single column.

Regards,
Prabhjot