Hive >> mail # user >> populating xml data in hive

Re: populating xml data in hive
You can use custom MapReduce code for this. Check the type of each record and, if it contains XML, preprocess it to remove the embedded newlines.
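
As a rough illustration of that preprocessing step (a sketch, not code from this thread), the Python below rejoins records that were split by embedded newlines. It assumes three pipe-separated fields per record (id, data, date, as in the original question), that `|` never appears inside a field value, and that a continuation line can be joined to the previous one with a single space. The function name `merge_records` and its parameters are hypothetical.

```python
def merge_records(lines, delimiter="|", expected_fields=3):
    """Yield one logical record per item, rejoining physical lines that
    were split by a newline embedded in a field value.

    Assumes the delimiter never occurs inside a field value.
    """
    pending = []   # physical lines collected for the current record
    seen = 0       # delimiters counted so far in the current record
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line and not pending:
            continue  # skip blank lines between records
        pending.append(line)
        seen += line.count(delimiter)
        # A record is complete once all of its delimiters have been seen.
        if seen >= expected_fields - 1:
            yield " ".join(pending)
            pending, seen = [], 0
    if pending:
        # Incomplete trailing record: emit it rather than silently drop it.
        yield " ".join(pending)
```

Because it only counts delimiters line by line, ordinary single-line records pass through unchanged and the input is never held in memory, so a file with millions of rows and only a few thousand XML rows costs one sequential pass. Feeding it `sys.stdin` and writing each result followed by `"\n"` gives a simple filter. If it is run as a distributed job, a record whose lines fall on either side of an input split boundary would still be broken, so that case needs separate handling.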

Bejoy KS

Sent from handheld, please excuse typos.

-----Original Message-----
From: iwannaplay games <[EMAIL PROTECTED]>
Date: Tue, 20 Nov 2012 14:29:18
Subject: Re: populating xml data in hive

How can I preprocess the data when there are millions of records, of
which only a few thousand contain XML data?
On 11/20/12, Nitin Pawar <[EMAIL PROTECTED]> wrote:
> Hive currently supports only newline as the record separator. If you have
> newlines in column values, you will need to preprocess your data and
> remove the newlines from the column values.
> On Nov 20, 2012 1:30 PM, "iwannaplay games" <[EMAIL PROTECTED]>
> wrote:
>> Hi All,
>> I have a CSV file (fields separated by |) where the data looks like this:
>> id   data                                      date
>> 1    apple                                     24-nov-2011
>> 2    mango                                     26-nov-2011
>> 3    <?xml version="1.0" encoding="utf-8"?>
>>      <a>fruits</a>                             28-nov-2011
>> 4    papaya                                    30-nov-2011
>> Since id=3 has a newline in the data field, Hive takes only the first
>> line and treats the second line as a different row. I want the full XML
>> field to be loaded into the data column of the Hive table.
>> It seems Hive doesn't support LINES TERMINATED BY '|'.
>> How should XML data be handled in Hive?
>> Thanks & Regards
>> Prabhjot