Pig, mail # user - Re: PIG script - PIGStorage


Re: PIG script - PIGStorage
Jonathan Coveney 2012-12-10, 18:33
The default loader (PigStorage) can't handle this, since it only splits each
line on a single field delimiter. You would need a custom InputFormat,
which isn't too bad.
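
As a rough illustration of the parsing such a custom reader would have to do, here is a minimal sketch in plain Java. It is not a Hadoop InputFormat; the class name `AngleBracketParser` and the method `parse` are made up for this example. It also assumes that only the innermost `<key,value>` groups count as records, so an outer wrapper like `<y ... >` and any stray text between groups are dropped. That reading of the sample data may not match what you actually need.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: pulls <key,value> pairs out of one line of text.
public class AngleBracketParser {
    // An innermost group: '<', a key with no '<', '>' or ',', a comma,
    // then a value with no '<' or '>', then '>'.
    private static final Pattern PAIR = Pattern.compile("<([^<>,]+),([^<>]*)>");

    public static List<String[]> parse(String line) {
        List<String[]> pairs = new ArrayList<>();
        Matcher m = PAIR.matcher(line);
        while (m.find()) {
            // Trim surrounding whitespace from both key and value.
            pairs.add(new String[] { m.group(1).trim(), m.group(2).trim() });
        }
        return pairs;
    }

    public static void main(String[] args) {
        String sample = "<x1,value1 ><y <x2,values> <x3,value3> > <x4, value 4> <x5, value5>";
        for (String[] p : parse(sample)) {
            System.out.println(p[0] + "\t" + p[1]);
        }
    }
}
```

In a real job this logic would sit inside the custom RecordReader (or a UDF applied to lines loaded as raw text), emitting one tuple per pair.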
2012/12/9 L N <[EMAIL PROTECTED]>

> Hi,
>
>
>
> > I have an unstructured file format. Assume below is the data in a file
> >
> > <x1,value1 ><y <x2,values> <x3,value3> > <x4, value 4> <x5, value5>
> >
>      abxcd xyxc
>
> > <x6, value6>
> > <x7,value7>
> >
> > I need to process only the data between < > and ignore all other
> > characters.
>
> >
> > How do I write a Pig script like the one below to load the data in the above file?
> > log = LOAD 'input' USING PigStorage(' ');
> >
> > What delimiter should be used for PigStorage here, and
> > how should I specify variable names? What else do I need to take care of?
> >
> >
> > Thanks
> >
>