Avro, mail # user - run time error during reduce stage: No field named ____ in: null


Brian Derickson 2012-11-01, 22:09
Dave Beech 2012-11-01, 22:49
Re: run time error during reduce stage: No field named ____ in: null
Brian Derickson 2012-11-02, 15:55
I've made another gist for this rather than clutter up the mail with code
snippets: https://gist.github.com/4002132

I basically just replaced all instances of Pair<GenericRecord, Integer> in
the reducer with plain GenericRecord. I also changed the output schema that
gets set in the Main function.

When I run this, I get a run time error that's also included in the above
gist: "java.lang.IllegalArgumentException: Not a Pair schema:"

The pom.xml file I'm using is also in this gist, in case I'm screwing up a
version somewhere. My intent is to be running on CDH4 using MRv1 and Avro
1.7.1, and as far as I can tell from the pom.xml I'm doing just that. Could
be mistaken.

Thanks again for your time.
On Thu, Nov 1, 2012 at 5:49 PM, Dave Beech <[EMAIL PROTECTED]> wrote:

> Hi Brian
>
> I don't think the output from the reducer should be a Pair. You said
> you got an error when you didn't use a Pair here - what was it?
>
> Cheers,
> Dave
>
> On 1 November 2012 22:09, Brian Derickson <[EMAIL PROTECTED]> wrote:
> > I've been pulling my hair out over this all day, and I'm hoping this is
> > something simple I'm overlooking.
> >
> > The relevant portions of my code, the schema I'm using, and the stack
> > trace are at https://gist.github.com/3996847.
> >
> > I'm using Hadoop 0.20.2 and Avro 1.7.1 as part of CDH4.
> >
> > To briefly describe what I'm doing: the mapper (not included in the
> > gist) is taking a bam file and spitting out some information. The key
> > is the chromosome and position colon delimited and the value is an
> > integer.
> >
> > The reducer is summing up all the integers at a particular position and
> > creating a Pair object containing a record using the schema included in
> > my gist. The second portion of the pair is an integer that I don't care
> > about... if I didn't use a Pair here, I'd get an error. If this is
> > something I could do differently, please correct me. :)
> >
> > Every time this is run, I get the stack trace included in the gist. I've
> > run out of things to try to fix this... I'd really really appreciate any
> > help I can get. Thanks!
>
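
For readers following along without the gist, the data flow described in the quoted message (keys of the form chromosome:position, integer values, a reducer that sums per key) can be sketched in plain Java. This is only an illustration under stated assumptions: the class and method names (PositionSums, record, sumByKey) are hypothetical, it uses no Hadoop or Avro classes, and it is not the code from the gist.

```java
import java.util.AbstractMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the map/reduce logic described in the thread.
// None of these names come from the original gist.
final class PositionSums {

    // Builds one map-output record: key "chromosome:position", integer value.
    static Map.Entry<String, Integer> record(String chromosome, long position, int value) {
        String key = chromosome + ":" + position;
        return new AbstractMap.SimpleEntry<String, Integer>(key, value);
    }

    // Mimics the reduce step: sums every value that shares the same key.
    // Keys appear in the result in the order they were first seen.
    static Map<String, Integer> sumByKey(List<Map.Entry<String, Integer>> mapOutput) {
        Map<String, Integer> totals = new LinkedHashMap<String, Integer>();
        for (Map.Entry<String, Integer> entry : mapOutput) {
            totals.merge(entry.getKey(), entry.getValue(), Integer::sum);
        }
        return totals;
    }
}
```

In the real job the framework does the grouping by key, and the reducer only sees one key with its values at a time; the sketch folds both steps into sumByKey to stay self-contained.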
Dave Beech 2012-11-02, 16:06
Brian Derickson 2012-11-02, 16:25
Harsh J 2012-11-02, 17:08