Re: Problem with xml data in hbase bulk loading
Hi,

There are two options:
1. Fix the input file so that one line contains an entire record.
2. Write a custom input format to read records that span multiple lines.
If you do this, you will also need to write a custom Mapper. For a
reference implementation of a custom mapper, have a look at the ImportTsv
class in HBase.
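
A minimal sketch of option 1, assuming every record has exactly three
|-separated fields and that | never occurs inside the XML. The names
merge_records, num_fields and newline_replacement are made up for
illustration and are not part of HBase, and the sample layout is a guess at
what the raw file looks like. It joins physical lines until a record has all
of its separators; a custom record reader for option 2 would need the same
grouping logic.

```python
import io


def merge_records(lines, separator="|", num_fields=3, newline_replacement=" "):
    """Yield one logical record per item.

    Physical lines are joined until the record contains the expected
    number of separators (num_fields - 1).
    """
    pending = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line and not pending:
            continue  # skip blank lines between records
        pending.append(line)
        joined = newline_replacement.join(pending)
        if joined.count(separator) >= num_fields - 1:
            yield joined
            pending = []
    if pending:
        # The file ended in the middle of a record.
        raise ValueError("incomplete record at end of input: %r" % pending)


# Assumed raw layout of the file from the original question.
sample = (
    "1|apple|24-nov-2011\n"
    "2|mango|26-nov-2011\n"
    '3|<?xml version="1.0" encoding="utf-8"?>\n'
    "<a>fruits</a>|28-nov-2011\n"
    "4|papaya|30-nov-2011\n"
)

records = list(merge_records(io.StringIO(sample)))
for record in records:
    print(record)
```

The merged lines can be written to a new file and loaded with importtsv as
before. Replacing the newline with a space changes the whitespace inside the
XML slightly, so pick a different replacement if that matters.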

HTH,
Anil Gupta
On Tue, Nov 20, 2012 at 3:59 AM, Jean-Marc Spaggiari <
[EMAIL PROTECTED]> wrote:

> Hi
>
> In csv files, new line = new entry :(
>
> So I think your only option is to fix your input file by removing your
> extra lines.
>
> JM
> On Nov 20, 2012 at 02:55, "iwannaplay games" <[EMAIL PROTECTED]>
> wrote:
>
> > Hi All,
> >
> > I have a CSV file (separated by |) where the data looks like this:
> >
> > id   data                                      date
> > 1    apple                                     24-nov-2011
> > 2    mango                                     26-nov-2011
> > 3    <?xml version="1.0" encoding="utf-8"?>
> >      <a>fruits</a>                             28-nov-2011
> > 4    papaya                                    30-nov-2011
> >
> >
> > Since id=3 has a newline in its data field, HBase importtsv takes only
> > the first line and treats the second line as a different row. I want the
> > full XML field to be stored in the data column of the HBase table.
> >
> > ./hadoop jar /usr/local/hbase/hbase-0.92.1.jar importtsv
> > -Dimporttsv.bulk.output=eve
> > -Dimporttsv.columns=HBASE_ROW_KEY,el:data,el:Date
> > '-Dimporttsv.separator=|'  fruits /fruits/fr
> >
> > How can I handle XML data in HBase when doing a bulk load?
> >
> > Thanks & Regards
> > Prabhjot
> >
>

--
Thanks & Regards,
Anil Gupta