Pig >> mail # user >> help with Map Type


Re: help with Map Type
On Tue, Jun 19, 2012 at 10:46 AM, Subir S <[EMAIL PROTECTED]> wrote:

> I think the content at the end of this link
>
> http://elasticmapreduce.s3.amazonaws.com/samples/pig-apache/do-reports.pig
> will help you!!
>
Thanks! I get a 404 when I click on that link.
> On Tue, Jun 19, 2012 at 10:50 PM, Subir S <[EMAIL PROTECTED]> wrote:
>
> > I suggest you load with two fields (uri, query), split at the '?' delimiter.
> >
> > Then use REGEX_EXTRACT to extract abc.com and REGEX_EXTRACT_ALL to
> > extract the query parameters.
> >
> > Use FOREACH ... GENERATE to turn the query into a map.
> >
> >
> > On Tue, Jun 19, 2012 at 3:33 AM, Mohit Anchlia <[EMAIL PROTECTED]> wrote:
> >
> >> Sorry, that wasn't a link. It's my input to Pig: basically what's inside
> >> params.dat. When I run those 3 Pig lines I get empty output. What I want
> >> is something like this:
> >>
> >> http://abc.com/?a=v1&b=v2
> >>
> >> broken down into a map, while also preserving abc.com. Otherwise, if
> >> it's complex, I can write UDFs.
> >>
> >>
> >> On Mon, Jun 18, 2012 at 1:04 PM, Subir S <[EMAIL PROTECTED]> wrote:
> >>
> >> > I think the link Mohit mentioned was his input. Not sure if I understood
> >> > correctly.
> >> >
> >> > I suspect something related to the schema.
> >> >
> >> > http://pig.apache.org/docs/r0.9.1/basic.html#map-schema
> >> >
> >> > http://stackoverflow.com/a/8238591
> >> >
> >> > So when you load with the delimiter '&', what will happen to the first
> >> > field? And how will the second field automatically become a map? I mean,
> >> > in your schema you mention only one field, not two fields (URL&QUERY).
> >> >
> >> > Thanks, Subir
> >> >
> >> > On Tue, Jun 19, 2012 at 12:20 AM, Jonathan Coveney <[EMAIL PROTECTED]> wrote:
> >> >
> >> > > Your link does not work, I recommend using pastebin.
> >> > >
> >> > > 2012/6/18 Mohit Anchlia <[EMAIL PROTECTED]>
> >> > >
> >> > > > I am trying to parse a URL using Pig's map type. My query string is:
> >> > > >
> >> > > > https://mail.google.com/mail/?tab=wm#drafts/13800c4ea3d11511&mail=123
> >> > > >
> >> > > > My very simple test script is this. But when I look at the part
> >> > > > file, it contains null.
> >> > > >
> >> > > > A = LOAD '/examples/map/input/params.dat' USING PigStorage('&') AS
> >> > > > (M:map[]);
> >> > > >
> >> > > > rmf '/examples/map/output/';
> >> > > >
> >> > > > STORE B INTO '/examples/map/output/';
> >> > > >
> >> > > > I am working on analyzing clickstream data. For this I need to first
> >> > > > parse these strings into files representing dimensions and also do
> >> > > > sessionization on them before loading them into an RDBMS.
> >> > > >
> >> > >
> >> >
> >>
> >
> >
>
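
For reference, the approach discussed in this thread (split at '?', keep the host, turn the query parameters into a map) can be sketched in plain Python. This only illustrates the logic a UDF might implement and is not code from the thread. The function name parse_url is hypothetical, nothing here is wired into Pig, and the sketch does no URL decoding and gives '#' fragments no special treatment.

```python
def parse_url(url):
    """Split a URL into (host, params); params maps query keys to values."""
    # Text before the first '?' is the location; text after it is the query.
    location, _, query = url.partition('?')
    # Drop the scheme (e.g. "http://") if present, then keep up to the first '/'.
    host = location.split('://', 1)[-1].split('/', 1)[0]
    params = {}
    for pair in query.split('&'):
        if not pair:
            continue  # empty query string or a stray '&'
        key, _, value = pair.partition('=')
        params[key] = value
    return host, params


host, params = parse_url('http://abc.com/?a=v1&b=v2')
print(host)    # abc.com
print(params)
```

With the sample input http://abc.com/?a=v1&b=v2, the host part is abc.com and the parameters a and b map to v1 and v2.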