Pig >> mail # user >> PigStorage


pablomar 2012-11-16, 20:48
Dmitriy Ryaboy 2012-11-16, 22:15
Re: PigStorage
+1 as well, but I'd suggest we do the following:

- Keep mProtoTuple private and add protected getters/setters instead with
javadocs describing expected usage.
- Rename mProtoTuple and the getters/setters to something more descriptive
than mProtoTuple.
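
To make the two suggestions above concrete, here is a rough sketch of the shape I have in mind. It is not the real PigStorage code: DelimitedLineLoader, TrimmingLoader, getCurrentFields, addField and parseLine are all hypothetical names made up for illustration, and the parsing is deliberately simplified. The point is only that the field list stays private while a subclass can still override readField and append cleaned values through protected accessors.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a delimiter-based loader. NOT the real PigStorage.
class DelimitedLineLoader {
    private final byte fieldDelimiter;

    // Stays private; subclasses use the protected accessors below.
    private List<Object> currentFields = new ArrayList<>();

    DelimitedLineLoader(char delimiter) {
        this.fieldDelimiter = (byte) delimiter;
    }

    /** Returns the fields collected so far for the line being parsed. */
    protected List<Object> getCurrentFields() {
        return currentFields;
    }

    /** Appends one parsed field to the line being parsed. */
    protected void addField(Object field) {
        currentFields.add(field);
    }

    /** Hook: turns buf[start, end) into a field. Subclasses may override. */
    protected void readField(byte[] buf, int start, int end) {
        if (start == end) {
            addField(null); // an empty field becomes null
        } else {
            addField(new String(buf, start, end - start, StandardCharsets.UTF_8));
        }
    }

    /** Splits one line on the delimiter, delegating each field to readField. */
    public List<Object> parseLine(String line) {
        currentFields = new ArrayList<>();
        byte[] buf = line.getBytes(StandardCharsets.UTF_8);
        int start = 0;
        for (int i = 0; i < buf.length; i++) {
            if (buf[i] == fieldDelimiter) {
                readField(buf, start, i);
                start = i + 1;
            }
        }
        readField(buf, start, buf.length); // last field on the line
        return currentFields;
    }
}

// A subclass only has to override the hook; nothing is copy/pasted.
class TrimmingLoader extends DelimitedLineLoader {
    TrimmingLoader(char delimiter) {
        super(delimiter);
    }

    @Override
    protected void readField(byte[] buf, int start, int end) {
        String cleaned = new String(buf, start, end - start, StandardCharsets.UTF_8).trim();
        addField(cleaned.isEmpty() ? null : cleaned);
    }
}
```

With this shape, new TrimmingLoader('\t').parseLine(" a \t\tb ") would give the fields "a", null, "b", and the subclass never touches the private list directly.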
On Fri, Nov 16, 2012 at 2:15 PM, Dmitriy Ryaboy <[EMAIL PROTECTED]> wrote:

> That sounds reasonable, I've run into the same problem. Do you mind
> submitting a patch?
>
> On Fri, Nov 16, 2012 at 12:48 PM, pablomar
> <[EMAIL PROTECTED]> wrote:
> > hi all,
> >
> > I'm using Pig 0.9.2 (Apache Pig version 0.9.2-cdh4.0.1, to be precise).
> > Today I hit a case in which I needed to clean up some fields before
> > processing. I will need to do the same in all my scripts, so instead of
> > doing it inside the scripts, I thought of extending PigStorage and doing it
> > inside my own Loader. My scripts will be shorter and cleaner.
> >
> > In fact, the only method that I needed to override was:
> > void readField(byte[] buf, int start, int end)
> >
> >
> > Everything went fine and it is working. The problem was that I had to
> > copy/paste a lot just because of private declarations,
> > for example:
> >   private byte fieldDel = '\t';
> >   private ArrayList<Object> mProtoTuple = null;
> >   private TupleFactory mTupleFactory = TupleFactory.getInstance();
> >   private boolean mRequiredColumnsInitialized = false;
> >
> > and of course:
> > private void readField(byte[] buf, int start, int end)
> >
> > so I had to copy/paste
> > public Tuple getNext() and all the aforementioned variables just to be
> > able to write my own readField
> >
> >
> > Would it be possible in future versions of Pig to make readField
> > protected, as well as mProtoTuple? I think it could be useful in cases like
> > mine.
> > I'm asking because I don't know the reasoning behind the decision to make
> > them private.
> >
> > thanks a lot,
>

--
*Note that I'm no longer using my Yahoo! email address. Please email me at
[EMAIL PROTECTED] going forward.*
pablomar 2012-11-19, 17:24
pablomar 2012-11-19, 21:17
Jonathan Coveney 2012-11-19, 23:32
pablomar 2012-11-20, 00:38