Pig, mail # user - Converting xml to csv


Re: Converting xml to csv
ajay kumar 2013-09-12, 06:35
Use org.apache.pig.piggybank.storage.XMLLoader to load the records, and then
extract the fields with a regex, e.g. REGEX_EXTRACT_ALL.
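
To illustrate the more general idea asked about in the quoted message (one script for every source, with each record exposed as a hashmap), here is a rough sketch in plain Python rather than Pig. It is not part of piggybank and is only an assumption about how one might approach it: the function names (iter_records, record_to_dict, xml_to_csv) are made up for this example, records are assumed to be flat (child elements holding only text), and the record tag is passed in per source, much as XMLLoader takes a tag name.

```python
import csv
import io
import re
import xml.etree.ElementTree as ET


def iter_records(text, record_tag):
    """Yield each <record_tag>...</record_tag> fragment found in text."""
    # Non-greedy match across newlines; assumes records are not nested.
    pattern = re.compile(
        r"<{0}>.*?</{0}>".format(re.escape(record_tag)), re.DOTALL
    )
    for match in pattern.finditer(text):
        yield match.group(0)


def record_to_dict(fragment):
    """Parse one record and map each child tag to its text content."""
    root = ET.fromstring(fragment)
    return {child.tag: (child.text or "").strip() for child in root}


def xml_to_csv(text, record_tag):
    """Convert all records in text to CSV, with a header row."""
    rows = [record_to_dict(f) for f in iter_records(text, record_tag)]
    # Column order follows the first appearance of each tag.
    columns = []
    for row in rows:
        for key in row:
            if key not in columns:
                columns.append(key)
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)  # missing fields are written as empty strings
    return out.getvalue()


if __name__ == "__main__":
    src2 = "<aux>\n<foobar>1</foobar>\n<fushbar>foo</fushbar>\n</aux>\n"
    print(xml_to_csv(src2, "aux"))
```

Only the record tag changes between sources ("foo" for src1.txt, "aux" for src2.txt); no per-field regex is needed. Nested elements or attributes would need extra handling that this sketch does not attempt.
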
On Thu, Sep 12, 2013 at 11:18 AM, jamal sasha <[EMAIL PROTECTED]> wrote:

> Umm.. yes.. but how do I generalize it?
> What I am looking for is something like a JSON parser in, say, Java:
> if I give it a valid JSON string, I can parse it and then access it
> as a hashmap.
> But with the XML loader, do I still have to specify regex rules?
>
> Actually, is it possible to just flatten the XML?
> For example, convert
> <aux>
> <foobar>1</foobar>
> <fushbar>foo</fushbar>
> </aux>
> to
> <aux><foobar>1</foobar><fushbar>foo</fushbar></aux>
> ???
>
> On Wed, Sep 11, 2013 at 10:32 PM, Jagat Singh <[EMAIL PROTECTED]>
> wrote:
>
> > Use the piggybank XMLLoader.
> >  On 12/09/2013 10:14 AM, "jamal sasha" <[EMAIL PROTECTED]> wrote:
> >
> > > Hi,
> > >   I have several different XML data sources. For example:
> > >
> > > src1.txt
> > >
> > > <foo>
> > > <bar>1</bar>
> > > </foo>
> > > <foo>
> > > <bar>2</bar>
> > > </foo>
> > > .. and so on
> > >
> > >
> > > and another data source:
> > >
> > > src2.txt
> > >
> > > <aux>
> > > <foobar>1</foobar>
> > > <fushbar>foo</fushbar>
> > > </aux>
> > >
> > > ... and so on
> > >
> > >
> > > So basically different XML (all valid formats).
> > >
> > > Rather than writing a different Pig script for each source, is there a
> > > way to write one script that converts all of this XML data into CSV?
> > > Thanks
> > >
> >
>

--
Thanks & Regards,
S. Ajay Kumar