Hive, mail # user - populating xml data in hive


iwannaplay games 2012-11-20, 07:59
Nitin Pawar 2012-11-20, 08:33
iwannaplay games 2012-11-20, 08:59
Re: populating xml data in hive
Nitin Pawar 2012-11-20, 09:03
You can simply write a MapReduce job to do the preprocessing for you.
Its output will then be readily available for a Hive table.
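
To make the preprocessing step concrete, here is a rough sketch of the merge logic only, written as a plain Python function rather than a full MapReduce job. It assumes the real file is pipe-delimited with exactly three fields (id|data|date, as described in the question), that the data field never contains a literal |, and that replacing an embedded newline with a space is acceptable. The function name merge_records and its parameters are hypothetical, made up for illustration.

```python
def merge_records(lines, delimiter="|", num_fields=3, newline_replacement=" "):
    """Join physical lines until each logical record has num_fields fields.

    Assumption: the delimiter never occurs inside a field value, so a record
    is complete once it contains num_fields - 1 delimiters.
    """
    pending = []
    for raw in lines:
        # Drop the line terminator; embedded newlines are re-inserted
        # as newline_replacement when the pieces are joined.
        pending.append(raw.rstrip("\r\n"))
        joined = newline_replacement.join(pending)
        if joined.count(delimiter) >= num_fields - 1:
            yield joined
            pending = []
    if pending:
        # Trailing incomplete record: emit it rather than silently drop it.
        yield newline_replacement.join(pending)


# Sample modelled on the rows in the question, with | as the separator.
SAMPLE_LINES = [
    "1|apple|24-nov-2011\n",
    "2|mango|26-nov-2011\n",
    '3|<?xml version="1.0" encoding="utf-8"?>\n',
    "<a>fruits</a>|28-nov-2011\n",
    "4|papaya|30-nov-2011\n",
]

if __name__ == "__main__":
    for record in merge_records(SAMPLE_LINES):
        print(record)
```

Note that this only works when the lines of one record are read in order by the same process. In a real MapReduce job an input split could fall in the middle of a multi-line record, so this would need a custom record reader or a single-pass cleanup before loading.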
On Nov 20, 2012 2:29 PM, "iwannaplay games" <[EMAIL PROTECTED]>
wrote:

> How can I preprocess the data when there are millions of records, of
> which only a few thousand contain XML data?
>
>
> On 11/20/12, Nitin Pawar <[EMAIL PROTECTED]> wrote:
> > Hive currently supports only newline as the record separator. If you have
> > newlines in column values, you will need to preprocess your data and
> > remove the newlines from the column values.
> > On Nov 20, 2012 1:30 PM, "iwannaplay games" <[EMAIL PROTECTED]>
> > wrote:
> >
> >> Hi All,
> >>
> >> I have a CSV file (separated by |) where the data looks like this:
> >>
> >> id   data                                      date
> >> 1    apple                                     24-nov-2011
> >> 2    mango                                     26-nov-2011
> >> 3    <?xml version="1.0" encoding="utf-8"?>
> >>      <a>fruits</a>                             28-nov-2011
> >> 4    papaya                                    30-nov-2011
> >>
> >>
> >> Since id=3 has a newline in the data field, Hive takes only the first
> >> line and treats the second line as a different row. I want the full XML
> >> field to be stored in the data column of the Hive table.
> >>
> >> It seems Hive doesn't support lines terminated by '|'.
> >>
> >> How should I handle XML data in Hive?
> >>
> >> Thanks & Regards
> >> Prabhjot
> >>
> >
>
Bejoy KS 2012-11-20, 09:03