Pig, mail # user - XMLLoader


RE: XMLLoader
Vivek Padmanabhan 2011-03-01, 05:38
Hi Baraa,

Consider the input
<a> <property>myvalue</property> </a>

The output of the script below:
A = load 'input' using org.apache.pig.piggybank.storage.XMLLoader('property') as (doc:chararray);
dump A;

will look like
(<property>myvalue</property>)

As of now, the loader does not support attributes.

One suggestion: use XMLLoader to load by the parent tag (Dicom), then use a UDF to parse the attribute values and text of each attr element.
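
As a rough sketch of the parsing logic such a UDF might contain, here is a plain Python version. The function names parse_attrs and has_attr are hypothetical and not part of piggybank, the regular expressions are a simplification rather than a full XML parser, and registering this as a Pig UDF is not shown. The sample document is the Dicom example from this thread.

```python
import re

# Matches one <attr ...>value</attr> element inside the loaded chararray.
ATTR_ELEMENT = re.compile(r'<attr\b([^>]*)>(.*?)</attr>', re.DOTALL)
# Matches name="value" pairs inside the opening tag.
NAME_VALUE = re.compile(r'(\w+)\s*=\s*"([^"]*)"')


def parse_attrs(doc):
    """Return one (tag, vr, len, value) tuple per <attr> element in doc."""
    rows = []
    for opening, value in ATTR_ELEMENT.findall(doc):
        fields = dict(NAME_VALUE.findall(opening))
        rows.append((fields.get('tag'), fields.get('vr'), fields.get('len'), value))
    return rows


def has_attr(doc, tag, value):
    """True if doc has an <attr> whose tag attribute and text both match."""
    return any(t == tag and v == value for t, _, _, v in parse_attrs(doc))


# Sample record, as it might arrive when loading by the parent tag (Dicom).
doc = ('<Dicom>'
       '<attr tag="00020000" vr="UL" len="4">180</attr>'
       '<attr tag="00020001" vr="OB" len="2">00\\01</attr>'
       '</Dicom>')
print(parse_attrs(doc))
print(has_attr(doc, '00020000', '180'))
```

A predicate like has_attr is the kind of check a filter on tag == 00020000 with value 180 would need; all fields come back as strings, so any numeric comparison would require a cast.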

Thanks and Regards
 Vivek

-----Original Message-----
From: Baraa Mohamad [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, March 01, 2011 1:34 AM
To: [EMAIL PROTECTED]
Subject: Re: XMLLoader

Thank you very much for your answer, but I cannot see how that would
help me.

What I want to do with XMLLoader is this: suppose we have the following XML
file
<Dicom>
      <attr tag="00020000" vr="UL" len="4">180</attr>
       <attr tag="00020001" vr="OB" len="2">00\01</attr>
</Dicom>

For example, I want to filter the files that contain an attr with
tag == 00020000 and value 180.

Can I do that?
Will XMLLoader treat the whole attr line ( <attr tag="00020000"
vr="UL" len="4">180</attr> ) as a string, so that I have to write a dedicated
parser to read the values?

If you have any examples that use XMLLoader, they would help me.

thank you

regards
On Mon, Feb 28, 2011 at 8:38 PM, Santhosh Srinivasan <[EMAIL PROTECTED]> wrote:

> Does this help -
> https://issues.apache.org/jira/browse/PIG-1842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12999315#comment-12999315
>
> Santhosh
>
> -----Original Message-----
> From: Baraa Mohamad [mailto:[EMAIL PROTECTED]]
> Sent: Saturday, February 26, 2011 3:24 PM
> To: [EMAIL PROTECTED]
> Subject: Re: XMLLoader
>
> Is anyone working on XMLLoader? If you have any examples,
> documentation, or anything else that could help me use this
> function properly, that would be very helpful.
>
> Kind regards
>
>
>
> On Tue, Feb 22, 2011 at 3:59 PM, Baraa Mohamad <[EMAIL PROTECTED]> wrote:
>
> > Hi all
> >
> > if I have the following XML file
> >
> > <attr tag="00020000" vr="UL" len="4">180</attr> <attr tag="00020001"
> > vr="OB" len="2">00\01</attr>
> >
> > How can I read it using XMLLoader? For example, how can I read the
> > values of tag and vr, which are attributes of the attr element?
> >
> > I already wrote the following
> >
> > A = load 'dicoms/' using
> > org.apache.pig.piggybank.storage.XMLLoader('attr')
> > as (x:chararray);
> >
> > But that treats each whole element as a chararray, so how can I read
> > the values of tag and vr, and the content of attr?
> >
> > best regards
> >
> >
> >
>