Pig, mail # user - JsonLoader schema field order shouldn't matter


Tim Sell 2013-01-07, 19:56
meghana narasimhan 2013-01-07, 21:32
Tim Sell 2013-01-08, 01:03
Alan Gates 2013-01-07, 20:24
Tim Sell 2013-01-08, 01:02
Alan Gates 2013-01-08, 17:38
Re: JsonLoader schema field order shouldn't matter
Dmitriy Ryaboy 2013-01-11, 03:35
Tim, can you open a GitHub issue with Elephant Bird (EB) about compiling against Pig 0.10?
I think this is an easy fix.
On Tue, Jan 8, 2013 at 9:38 AM, Alan Gates <[EMAIL PROTECTED]> wrote:

> I would open a new JIRA, since 1914 is focussed on building an alternative
> that discovers schema, while you want to improve the existing one.
>
> Alan.
>
> On Jan 7, 2013, at 5:02 PM, Tim Sell wrote:
>
> > This seems like a bug to me. It makes it risky to work with JSON data
> > generated by something other than Pig since the ordering might change.
> > What do you think?
> >
> > I didn't see a bug for it in Jira, so would this (still open) one be
> > the place to mention it? Or should I make a new one?
> > https://issues.apache.org/jira/browse/PIG-1914
> >
> > ~T
> >
> >
> > On 7 January 2013 20:24, Alan Gates <[EMAIL PROTECTED]> wrote:
> >> Currently the JsonLoader does assume ordering of the fields. It does
> >> not do any name matching against the given schema to find the right field.
> >>
> >> Alan.
> >>
> >> On Jan 7, 2013, at 11:56 AM, Tim Sell wrote:
> >>
> >>> When using JsonLoader with Pig 0.10.0
> >>>
> >>> if I have an input.json file that looks like this:
> >>>
> >>> {"date": "2007-08-25", "id": 16}
> >>> {"date": "2007-09-08", "id": 17}
> >>> {"date": "2007-09-15", "id": 18}
> >>>
> >>> And I use
> >>>
> >>> a = LOAD 'input.json' USING JsonLoader('id:int,date:chararray');
> >>> DUMP a;
> >>>
> >>> I get errors when it tries to force the date fields into an integer.
> >>>
> >>> Shouldn't this work independent of the ordering of the schema fields?
> >>> JSON writers generally don't make guarantees about the ordering.
> >>>
> >>> One alternative (though annoying) would be to use Elephant Bird
> >>> instead, but I can't get that to compile against Hadoop 2.0.0 and Pig
> >>> 0.10.0.
> >>>
> >>> ~Tim
> >>
>
>
Ruslan Al-Fakikh 2013-04-05, 00:51
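
For readers following the thread, the difference between the positional matching Alan describes and the name-based matching Tim is asking for can be illustrated with a small Python sketch. This is not Pig's JsonLoader code; the `SCHEMA` list and the `load_positional` / `load_by_name` helpers are hypothetical names made up for illustration, and the schema mirrors the `id:int,date:chararray` string from the original report.

```python
import json

# Hypothetical stand-in for the schema 'id:int,date:chararray':
# (field name, conversion function) pairs in the order the user declared them.
SCHEMA = [("id", int), ("date", str)]


def load_positional(line, schema):
    """Pair the i-th JSON value with the i-th schema field, ignoring key names."""
    # json.loads returns a dict that keeps the key order of the input text.
    values = list(json.loads(line).values())
    return tuple(cast(value) for (_, cast), value in zip(schema, values))


def load_by_name(line, schema):
    """Look each schema field up by key, so key order in the record is irrelevant."""
    record = json.loads(line)
    return tuple(cast(record[name]) for name, cast in schema)


if __name__ == "__main__":
    line = '{"date": "2007-08-25", "id": 16}'
    print("by name:", load_by_name(line, SCHEMA))
    try:
        load_positional(line, SCHEMA)
    except ValueError as err:
        # The date string lands in the 'id' slot and cannot be converted to int.
        print("positional load failed:", err)
```

With the record from the original report, the positional loader fails because the date string is handed to the `int` conversion, while the by-name loader is unaffected by key order. Under the same assumption about positional matching, declaring the schema in the order the keys actually appear in the file would sidestep the error.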