Pig >> mail # user >> PigStorage


hi all,

I'm using Pig 0.9.2 (Apache Pig version 0.9.2-cdh4.0.1, to be precise).
Today I had a case where I needed to clean up some fields before
processing, and I will need to do the same in all my scripts. So instead of
doing it inside each script, I thought of extending PigStorage and doing it
inside my own loader. That way my scripts will be shorter and cleaner.

In fact, the only method I needed to override was:
void *readField*(byte[] buf, int start, int end)
Everything is OK and it is working. The problem was that I had to copy/paste a
lot of code just because of private declarations,
for example:
  private byte fieldDel = '\t';
  private ArrayList<Object> mProtoTuple = null;
  private TupleFactory mTupleFactory = TupleFactory.getInstance();
  private boolean mRequiredColumnsInitialized = false;

and of course:
*private* void readField(byte[] buf, int start, int end)

So I had to copy/paste public Tuple getNext() and all the aforementioned
variables just to be able to write my own *readField*.
Would it be possible in future versions of Pig to make *readField* protected,
as well as *mProtoTuple*? I think it could be useful in cases like mine.
I'm asking because I don't know the reasoning behind the decision to make
them private.
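
To make the request concrete, here is a minimal sketch of the design I have in mind. It does not use Pig's real classes: DelimitedLineLoader, CleaningLineLoader and parseLine are hypothetical stand-ins for PigStorage, my own loader and getNext(), and the cleanup itself (trimming whitespace, stripping double quotes) is just an assumed example. The point is only that with a protected readField and a protected mProtoTuple, the subclass needs a single override and no copy/paste:

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a delimiter-based loader; NOT Pig's actual PigStorage source.
class DelimitedLineLoader {
    private final byte fieldDel;

    // Protected (instead of private) so a subclass can append cleaned fields.
    protected List<Object> mProtoTuple = null;

    DelimitedLineLoader(byte fieldDel) {
        this.fieldDel = fieldDel;
    }

    // Plays the role of getNext(): splits one line and hands each field to readField.
    public List<Object> parseLine(byte[] buf) {
        mProtoTuple = new ArrayList<>();
        int start = 0;
        for (int i = 0; i < buf.length; i++) {
            if (buf[i] == fieldDel) {
                readField(buf, start, i);
                start = i + 1;
            }
        }
        readField(buf, start, buf.length); // last field, possibly empty
        List<Object> result = mProtoTuple;
        mProtoTuple = null;
        return result;
    }

    // Protected hook: the default keeps the raw text and maps an empty field to null.
    protected void readField(byte[] buf, int start, int end) {
        if (start == end) {
            mProtoTuple.add(null);
        } else {
            mProtoTuple.add(new String(buf, start, end - start, StandardCharsets.UTF_8));
        }
    }
}

// The custom loader: only readField is overridden, nothing is copied from the parent.
class CleaningLineLoader extends DelimitedLineLoader {
    CleaningLineLoader(byte fieldDel) {
        super(fieldDel);
    }

    @Override
    protected void readField(byte[] buf, int start, int end) {
        // Assumed cleanup for illustration: trim whitespace and drop double quotes.
        String cleaned = new String(buf, start, end - start, StandardCharsets.UTF_8)
                .trim()
                .replace("\"", "");
        mProtoTuple.add(cleaned.isEmpty() ? null : cleaned);
    }
}

class LoaderDemo {
    public static void main(String[] args) {
        byte[] line = " \"foo\" \tbar\t".getBytes(StandardCharsets.UTF_8);
        System.out.println(new DelimitedLineLoader((byte) '\t').parseLine(line));
        System.out.println(new CleaningLineLoader((byte) '\t').parseLine(line));
    }
}
```

With the fields and the method private, as they are today, CleaningLineLoader would have to duplicate parseLine and every field it touches, which is exactly what I ended up doing.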

thanks a lot,
+
Dmitriy Ryaboy 2012-11-16, 22:15
+
Bill Graham 2012-11-19, 17:16
+
pablomar 2012-11-19, 17:24
+
pablomar 2012-11-19, 21:17
+
Jonathan Coveney 2012-11-19, 23:32
+
pablomar 2012-11-20, 00:38