Pig >> mail # user >> PigStorage


That sounds reasonable; I've run into the same problem. Would you mind
submitting a patch?

On Fri, Nov 16, 2012 at 12:48 PM, pablomar
<[EMAIL PROTECTED]> wrote:
> hi all,
>
> I'm using Pig 0.9.2 (Apache Pig version 0.9.2-cdh4.0.1, to be precise).
> Today I had a case where I needed to clean up some fields before
> processing, and I will need to do the same in all my scripts. So instead of
> doing it inside the scripts, I thought of extending PigStorage and doing it
> inside my own loader. My scripts will be shorter and cleaner.
>
> In fact, the only method that I needed to override was:
> void *readField*(byte[] buf, int start, int end)
>
>
> Everything is fine and it works. The problem is that I had to copy/paste a
> lot of code just because of private declarations,
> for example:
>   private byte fieldDel = '\t';
>   private ArrayList<Object> mProtoTuple = null;
>   private TupleFactory mTupleFactory = TupleFactory.getInstance();
>   private boolean mRequiredColumnsInitialized = false;
>
> and of course:
> *private* void readField(byte[] buf, int start, int end)
>
> so I had to copy/paste:
> public Tuple getNext() and all the aforementioned variables just to be able
> to write my own *readField*
>
>
> Would it be possible in future versions of Pig to make *readField* protected,
> as well as *mProtoTuple*? I think it could be useful in cases like
> mine.
> I'm asking because I don't know the reasoning behind the decision to make
> them private.
>
> thanks a lot,
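
For anyone following along, here is a rough sketch of the pattern being requested. It is not the real PigStorage source: DelimitedLineReader, CleaningLineReader, parseLine and protoTuple are hypothetical stand-ins, and plain Strings are used in place of Pig's tuple and byte-array types so the example runs without Pig on the classpath. It only illustrates how a protected readField and a protected field list would let a subclass clean fields without copying the parsing loop.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a PigStorage-like loader (NOT the real Pig class).
class DelimitedLineReader {
    private final byte fieldDel;

    // Protected rather than private, so subclasses can append cleaned fields.
    protected List<Object> protoTuple;

    DelimitedLineReader(char delimiter) {
        this.fieldDel = (byte) delimiter;
    }

    // Splits one line on the delimiter and hands each field to readField.
    public List<Object> parseLine(byte[] buf) {
        protoTuple = new ArrayList<>();
        int start = 0;
        for (int i = 0; i < buf.length; i++) {
            if (buf[i] == fieldDel) {
                readField(buf, start, i);
                start = i + 1;
            }
        }
        readField(buf, start, buf.length); // last (or only) field
        List<Object> result = protoTuple;
        protoTuple = null;
        return result;
    }

    // Protected hook: an empty field becomes null, anything else a String.
    protected void readField(byte[] buf, int start, int end) {
        if (start == end) {
            protoTuple.add(null);
        } else {
            protoTuple.add(new String(buf, start, end - start, StandardCharsets.UTF_8));
        }
    }
}

// Subclass that only overrides the hook; the parsing loop is inherited.
class CleaningLineReader extends DelimitedLineReader {
    CleaningLineReader(char delimiter) {
        super(delimiter);
    }

    @Override
    protected void readField(byte[] buf, int start, int end) {
        // Example clean-up (an assumption): strip leading and trailing spaces.
        while (start < end && buf[start] == ' ') start++;
        while (end > start && buf[end - 1] == ' ') end--;
        super.readField(buf, start, end);
    }
}

class ReadFieldSketch {
    public static void main(String[] args) {
        byte[] line = " a \tb\t".getBytes(StandardCharsets.UTF_8);
        System.out.println(new CleaningLineReader('\t').parseLine(line)); // prints [a, b, null]
    }
}
```

With private members, as in the 0.9.2 code described above, the subclass would instead have to duplicate the whole parsing method and its state just to change how a single field is read.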