Pig >> mail # user >> JsonLoader schema field order shouldn't matter


Re: JsonLoader schema field order shouldn't matter
Tim, can you open a GitHub issue with Elephant Bird (EB) about compiling against Pig 0.10?
I think this is an easy fix.
On Tue, Jan 8, 2013 at 9:38 AM, Alan Gates <[EMAIL PROTECTED]> wrote:

> I would open a new JIRA, since 1914 is focussed on building an alternative
> that discovers schema, while you are wanting to improve the existing one.
>
> Alan.
>
> On Jan 7, 2013, at 5:02 PM, Tim Sell wrote:
>
> > This seems like a bug to me. It makes it risky to work with JSON data
> > generated by something other than Pig since the ordering might change.
> > What do you think?
> >
> > I didn't see a bug for it in Jira, so would this (still open) one be
> > the place to mention it? Or should I make a new one?
> > https://issues.apache.org/jira/browse/PIG-1914
> >
> > ~T
> >
> >
> > On 7 January 2013 20:24, Alan Gates <[EMAIL PROTECTED]> wrote:
> >> Currently the JsonLoader does assume ordering of the fields.  It does
> >> not do any name matching against the given schema to find the right field.
> >>
> >> Alan.
> >>
> >> On Jan 7, 2013, at 11:56 AM, Tim Sell wrote:
> >>
> >>> When using JsonLoader with Pig 0.10.0
> >>>
> >>> if I have an input.json file that looks like this:
> >>>
> >>> {"date": "2007-08-25", "id": 16}
> >>> {"date": "2007-09-08", "id": 17}
> >>> {"date": "2007-09-15", "id": 18}
> >>>
> >>> And I use
> >>>
> >>> a = LOAD 'input.json' USING JsonLoader('id:int,date:chararray');
> >>> DUMP a;
> >>>
> >>> I get errors when it tries to force the date fields into an integer.
> >>>
> >>> Shouldn't this work independent of the ordering of the schema fields?
> >>> JSON writers generally don't make guarantees about the ordering.
> >>>
> >>> One alternative (though annoying) would be to use Elephant Bird
> >>> instead, but I can't get that to compile against Hadoop 2.0.0 and Pig
> >>> 0.10.0.
> >>>
> >>> ~Tim
> >>
>
>
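
A possible interim workaround, sketched from Alan's description that JsonLoader matches fields by position rather than by name: declare the schema in the same order as the keys appear in the file, then reorder the columns afterwards. This assumes every record in input.json lists "date" before "id", as in Tim's sample; it does not help if the writer changes the key order between records. The alias b is only illustrative.

```pig
-- Schema order follows the key order in input.json (date first, then id),
-- because JsonLoader is described above as assigning fields by position.
a = LOAD 'input.json' USING JsonLoader('date:chararray,id:int');

-- Reorder the columns to the layout the rest of the script expects.
b = FOREACH a GENERATE id, date;
DUMP b;
```

This only sidesteps the problem for files with a stable key order; name-based matching would still need a change to the loader itself, which is what the proposed new JIRA would cover.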