iwannaplay games 2012-11-20, 07:54
Jean-Marc Spaggiari 2012-11-20, 11:59
Re: Problem with xml data in hbase bulk loading
Hi,

There are two options:
1. Fix the input file so that each line contains an entire record.
2. Write a custom input format that reads records spanning multiple lines.
If you do this, you will also need to write a custom Mapper. For a
reference implementation of a custom mapper, have a look at the ImportTsv
class in HBase.
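
For option 1, here is a minimal preprocessing sketch in Python. It assumes every record has exactly three |-separated fields (id, data, date) and that the XML itself never contains a "|". The function name join_multiline_records and its parameters are hypothetical and are not part of HBase or ImportTsv. It buffers physical lines until it has seen enough separators for a full record, then emits the record as a single line.

```python
def join_multiline_records(lines, separator="|", num_fields=3, newline_token=" "):
    """Yield one complete record per item from an iterable of physical lines.

    Assumption: a record is complete once it contains num_fields - 1
    separators, and the separator never appears inside a field value.
    """
    needed = num_fields - 1
    buffer = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not buffer and not line.strip():
            continue  # skip blank lines that fall between records
        buffer.append(line)
        joined = newline_token.join(buffer)
        if joined.count(separator) >= needed:
            yield joined
            buffer = []
    if buffer:
        # Trailing incomplete record: emit it rather than drop it silently.
        yield newline_token.join(buffer)


# Sample input modeled on the data in the original question.
sample = [
    "1|apple|24-nov-2011\n",
    "2|mango|26-nov-2011\n",
    '3|<?xml version="1.0" encoding="utf-8"?>\n',
    "<a>fruits</a>|28-nov-2011\n",
    "4|papaya|30-nov-2011\n",
]
fixed = list(join_multiline_records(sample))
```

Note that the embedded newline is replaced by a space here, so the stored XML is not byte-for-byte identical to the source. If that matters, pass a different newline_token and undo it when reading the value back.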

HTH,
Anil Gupta
On Tue, Nov 20, 2012 at 3:59 AM, Jean-Marc Spaggiari <
[EMAIL PROTECTED]> wrote:

> Hi
>
> In csv files, new line = new entry :(
>
> So I think your only option is to fix your input file by removing your
> extra lines.
>
> JM
> On Nov 20, 2012 02:55, "iwannaplay games" <[EMAIL PROTECTED]> wrote:
>
> > Hi All,
> >
> > I have a CSV file (separated by |) where the data looks like this:
> >
> > id    data                                      date
> > 1     apple                                     24-nov-2011
> > 2     mango                                     26-nov-2011
> > 3     <?xml version="1.0" encoding="utf-8"?>
> >       <a>fruits</a>                             28-nov-2011
> > 4     papaya                                    30-nov-2011
> >
> >
> > Since id=3 has a newline in the data field, HBase importtsv takes only the
> > first line and treats the second line as a different row. I want the full
> > XML field to be stored in the data column of the HBase table.
> >
> > ./hadoop jar /usr/local/hbase/hbase-0.92.1.jar importtsv \
> >   -Dimporttsv.bulk.output=eve \
> >   -Dimporttsv.columns=HBASE_ROW_KEY,el:data,el:Date \
> >   '-Dimporttsv.separator=|' fruits /fruits/fr
> >
> > How should I handle XML data in HBase when doing a bulk load?
> >
> > Thanks & Regards
> > Prabhjot
> >
>

--
Thanks & Regards,
Anil Gupta
iwannaplay games 2012-11-21, 10:38