NEW: Monitor These Apps!
elasticsearch, apache solr, apache hbase, hadoop, redis, casssandra, amazon cloudwatch, mysql, memcached, apache kafka, apache zookeeper, apache storm, ubuntu, centOS, red hat, debian, puppet labs, java, senseiDB
Pig >> mail # user >> help with Map Type


Re: help with Map Type
Sorry, that wasn't a link. It's my input to Pig, basically what's inside
params.dat. When I run those 3 Pig lines I get empty output. What I want is
for something like this:

http://abc.com/?a=v1&b=v2

to be broken down into a map, while also preserving abc.com. Otherwise, if
that's too complex, I can write UDFs.
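
For reference, the logic such a UDF would need can be sketched roughly as below. This is a minimal standalone Python sketch, not a registered Pig UDF: the helper names split_url and to_pig_map_text are hypothetical, and the [key#value,...] text form is assumed to be the map representation PigStorage expects when a field is declared as map[].

```python
from urllib.parse import urlsplit, parse_qsl


def split_url(url):
    """Return (host, params), where params maps query keys to values."""
    parts = urlsplit(url)
    # parse_qsl yields (key, value) pairs; dict() keeps the last value per key.
    params = dict(parse_qsl(parts.query, keep_blank_values=True))
    return parts.netloc, params


def to_pig_map_text(params):
    """Render a dict as [key#value,...] text (assumed Pig map text form)."""
    return "[" + ",".join(f"{k}#{v}" for k, v in params.items()) + "]"


host, params = split_url("http://abc.com/?a=v1&b=v2")
print(host)                     # abc.com
print(params)                   # {'a': 'v1', 'b': 'v2'}
print(to_pig_map_text(params))  # [a#v1,b#v2]
```

Note that in the original sample URL everything after '#' is a fragment, so a standard URL parser only sees tab=wm as the query string; mail=123 would have to be pulled out of the fragment separately.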
On Mon, Jun 18, 2012 at 1:04 PM, Subir S <[EMAIL PROTECTED]> wrote:

> I think the link Mohit mentioned was his input. Not sure if I understood
> correctly.
>
> I suspect something related to the schema.
>
> http://pig.apache.org/docs/r0.9.1/basic.html#map-schema
>
> http://stackoverflow.com/a/8238591
>
> So when you load with delimiter '&', what will happen to the first field?
> And how will the second field automatically become a map? I mean, in your
> schema you mention only one field, not two fields (URL & QUERY).
>
> Thanks, Subir
>
> On Tue, Jun 19, 2012 at 12:20 AM, Jonathan Coveney <[EMAIL PROTECTED]> wrote:
>
> > Your link does not work, I recommend using pastebin.
> >
> > 2012/6/18 Mohit Anchlia <[EMAIL PROTECTED]>
> >
> > > I am trying to parse a URL using Pig's map type. My query string is:
> > >
> > > https://mail.google.com/mail/?tab=wm#drafts/13800c4ea3d11511&mail=123
> > >
> > > My very simple script for testing is this. But when I look at the part
> > > file it returns null.
> > >
> > > A = LOAD '/examples/map/input/params.dat' USING PigStorage('&') AS
> > > (M:map[]);
> > >
> > > rmf '/examples/map/output/';
> > >
> > > STORE A INTO '/examples/map/output/';
> > >
> > > I am working on analyzing clickstream data. For this I need to first
> > > parse these strings into files representing dimensions and also do
> > > sessionization on them before loading them into an RDBMS.
> > >
> >
>