-
Re: PIG script - PIGStorage
L N 2012-12-09, 16:51
Hi,
I have an unstructured file format. Assume the data below is in a file:

<x1,value1 ><y <x2,values> <x3,value3> > <x4, value 4> <x5, value5>
abxcd xyxc
<x6, value6>
<x7,value7>

I need to process only the data between < > and ignore all other characters.

How do I write a Pig script like the one below to load the data in the file above?

log = LOAD 'input' USING PigStorage(' ')

What delimiter should be used for PigStorage, and how should I specify the variable names? What else do I need to take care of?

Thanks
-
Re: PIG script - PIGStorage
Jonathan Coveney 2012-12-10, 18:33
The default loader can't handle this. You would need a custom InputFormat, which isn't too bad.
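For illustration only, here is a rough sketch in plain Java of the parsing step such a custom reader might perform on each line. It assumes every record has the form <key, value>, that keys contain no whitespace or commas, and that anything outside a well-formed pair can be dropped. The class name AngleBracketParser and the method parse are hypothetical; they are not part of Pig or Hadoop, and wiring this into an actual InputFormat or loader is not shown.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Pig or Hadoop: extracts <key, value>
// pairs from one line of input and ignores everything else.
class AngleBracketParser {
    // "<", optional spaces, a key without whitespace/commas/brackets, a comma,
    // then a value that runs up to the next ">" (inner spaces are kept,
    // surrounding spaces are trimmed).
    private static final Pattern PAIR =
        Pattern.compile("<\\s*([^<>,\\s]+)\\s*,\\s*([^<>]*?)\\s*>");

    // Returns one {key, value} array per well-formed pair found in the line.
    static List<String[]> parse(String line) {
        List<String[]> pairs = new ArrayList<>();
        Matcher m = PAIR.matcher(line);
        while (m.find()) {
            pairs.add(new String[] { m.group(1), m.group(2) });
        }
        return pairs;
    }

    public static void main(String[] args) {
        // First sample line from the question, including the stray "<y" and ">".
        String line =
            "<x1,value1 ><y <x2,values> <x3,value3> > <x4, value 4> <x5, value5>";
        for (String[] p : parse(line)) {
            System.out.println(p[0] + "\t" + p[1]);
        }
    }
}
```

Under these assumptions, the malformed "<y" fragment and the bare line "abxcd xyxc" produce no pairs, so only the bracketed data survives. If the real data allows commas or angle brackets inside values, the pattern would need to change.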