Home | About | Sematext search-lucene.com search-hadoop.com
NEW: Monitor These Apps!
elasticsearch, apache solr, apache hbase, hadoop, redis, casssandra, amazon cloudwatch, mysql, memcached, apache kafka, apache zookeeper, apache storm, ubuntu, centOS, red hat, debian, puppet labs, java, senseiDB
 Search Hadoop and all its subprojects:

Switch to Plain View
Pig >> mail # user >> getWrappedSplit() is incorrectly returning the first split


+
Alex Rovner 2012-01-06, 21:50
+
Alex Rovner 2012-01-07, 06:21
+
Daniel Dai 2012-01-07, 23:14
+
Alex Rovner 2012-01-09, 18:48
+
Daniel Dai 2012-01-09, 19:06
+
Aniket Mokashi 2012-01-10, 00:44
+
Prashant Kommireddi 2012-01-10, 00:58
+
Jonathan Coveney 2012-01-10, 01:07
+
Alex Rovner 2012-01-10, 05:10
Copy link to this message
-
Re: getWrappedSplit() is incorrectly returning the first split
The change was added as part of PIG-1518. It has release notes-

"This change will not cause any backward compatibility issue except if a
loader implementation makes use of the PigSplit object passed through the
prepareToRead method where a rebuild of the loader might be necessary as
PigSplit's definition has been modified. However, currently we know of no
external use of the object.

This change also requires the loader to be stateless across the invocations
to the prepareToRead method. That is, the method should reset any internal
states that are not affected by the RecordReader argument.
Otherwise, this feature should be disabled.

It looks like returning 0th split was done deliberately. Comments?

Thanks,
Aniket

On Mon, Jan 9, 2012 at 9:10 PM, Alex Rovner <[EMAIL PROTECTED]> wrote:

> I have already created the patch and tested with some of my jobs. I ran
> into unit tests failure issues though as well. I can attach the patch to
> Jira tomorrow anyways to be applied once things are straightened out.
>
> Alex R
>
> On Mon, Jan 9, 2012 at 8:07 PM, Jonathan Coveney <[EMAIL PROTECTED]>
> wrote:
>
> > If it is affecting production jobs, I see no reason why we can't put the
> > fix into 0.9.2, though I sense that a vote will be coming soon for a
> 0.9.2
> > release, so a fix would have to come soon..the issues running the tests
> > brought up in Bill's thread will have to be fixed before we can, though.
> I
> > have a patch that's completely stopped because I can develop any new
> tests,
> > and so on.
> >
> > 2012/1/9 Prashant Kommireddi <[EMAIL PROTECTED]>
> >
> > > Is this critical enough to make it back into 0.9.1?
> > >
> > > -Prashant
> > >
> > > On Mon, Jan 9, 2012 at 4:44 PM, Aniket Mokashi <[EMAIL PROTECTED]>
> > > wrote:
> > >
> > > > Thanks so much for finding this out.
> > > >
> > > > I was using
> > > >
> > > > @Override
> > > >
> > > > public void prepareToRead(@SuppressWarnings("rawtypes")
> > > > RecordReaderreader, PigSplit split)
> > > >
> > > >  throws IOException {
> > > >
> > > >  this.in = reader;
> > > >
> > > >  partValues > > > >
> > > >
> > >
> >
> ((DataovenSplit)split.getWrappedSplit()).getPartitionInfo().getPartitionValues();
> > > >
> > > >
> > > > in my loader that behaves like hcatalog for delimited text in hive.
> > That
> > > > returns me same partvalues for all the values. I hacked it with
> > something
> > > > else. But, I think I must have hit this case. I will confirm. Thanks
> > > again
> > > > for reporting this.
> > > >
> > > > Thanks,
> > > >
> > > > Aniket
> > > >
> > > > On Mon, Jan 9, 2012 at 11:06 AM, Daniel Dai <[EMAIL PROTECTED]>
> > > wrote:
> > > >
> > > > > Yes, please. Thanks!
> > > > >
> > > > > On Mon, Jan 9, 2012 at 10:48 AM, Alex Rovner <[EMAIL PROTECTED]
> >
> > > > wrote:
> > > > >
> > > > > > Jira opened.
> > > > > >
> > > > > > I can attempt to submit a patch as this seems like a fairly
> > straight
> > > > > > forward fix.
> > > > > >
> > > > > > https://issues.apache.org/jira/browse/PIG-2462
> > > > > >
> > > > > >
> > > > > > Thanks
> > > > > > Alex R
> > > > > >
> > > > > > On Sat, Jan 7, 2012 at 6:14 PM, Daniel Dai <
> [EMAIL PROTECTED]>
> > > > > wrote:
> > > > > >
> > > > > > > Sounds like a bug. I guess no one ever rely on specific split
> > info
> > > > > > before.
> > > > > > > Please open a Jira.
> > > > > > >
> > > > > > > Daniel
> > > > > > >
> > > > > > > On Fri, Jan 6, 2012 at 10:21 PM, Alex Rovner <
> > [EMAIL PROTECTED]
> > > >
> > > > > > wrote:
> > > > > > >
> > > > > > > > Additionally it looks like PigRecordReader is not
> incrementing
> > > the
> > > > > > index
> > > > > > > in
> > > > > > > > the PigSplit when dealing with CombinedInputFormat thus the
> > index
> > > > > will
> > > > > > be
> > > > > > > > incorrect in either case.
> > > > > > > >
> > > > > > > > On Fri, Jan 6, 2012 at 4:50 PM, Alex Rovner <
> > > [EMAIL PROTECTED]>
> > > > > > > wrote:
> > > > > > > >
> > > > > > > > > Ran into this today. Using trunk (0.11)

"...:::Aniket:::... Quetzalco@tl"
+
Daniel Dai 2012-01-10, 09:22
+
Alex Rovner 2012-01-10, 13:56
NEW: Monitor These Apps!
elasticsearch, apache solr, apache hbase, hadoop, redis, casssandra, amazon cloudwatch, mysql, memcached, apache kafka, apache zookeeper, apache storm, ubuntu, centOS, red hat, debian, puppet labs, java, senseiDB