Pig >> mail # user >> Subtracting values


Re: Subtracting values
I haven't understood your data/schema.

I am hoping this is close to what you are trying to solve:
-- Assumed schema of inp: (timestamp: int, user, url)

user_url_group = GROUP inp BY (user, url);
-- Duration per (user, url): latest hit time minus earliest hit time.
session_duration = FOREACH user_url_group GENERATE group.user AS user,
    group.url AS url, MAX(inp.timestamp) - MIN(inp.timestamp) AS duration;

-Thejas
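
As an illustration only (not part of the original reply), the grouped MAX minus MIN above can be traced on the five sample rows quoted below with a small Python sketch. The function names `duration_per_group` and `time_until_next_hit` are hypothetical. The second function is one assumed reading of the original question: the time between a user's consecutive hits, ordered by timestamp.

```python
from collections import defaultdict

# Sample rows copied from the quoted output: (guid, url, hit_time_gmt).
ROWS = [
    ("1010046645226466896", "http://www.url.com/", 1277793285),
    ("1010046645226466896", "http:///www.url.com/?image=580", 1277793315),
    ("1010046645226466896", "http:///www.url.com/?image=582", 1277793359),
    ("1010046645226466896", "http:///www.url.com/?image=582", 1277793470),
    ("1010046645226466896", "http:///www.url.com/?image=585", 1277793387),
]


def duration_per_group(rows):
    """Mirror GROUP BY (user, url), then MAX(timestamp) - MIN(timestamp)."""
    times = defaultdict(list)
    for user, url, ts in rows:
        times[(user, url)].append(ts)
    return {key: max(ts_list) - min(ts_list) for key, ts_list in times.items()}


def time_until_next_hit(rows):
    """Per user, sort hits by time and subtract each hit's time from the next one's.

    Assumption: "time spent on a page" means the gap until the user's next hit.
    The user's final hit has no successor, so it produces no row.
    """
    by_user = defaultdict(list)
    for user, url, ts in rows:
        by_user[user].append((ts, url))
    result = []
    for user, hits in by_user.items():
        hits.sort()
        for (ts, url), (next_ts, _) in zip(hits, hits[1:]):
            result.append((user, url, next_ts - ts))
    return result


if __name__ == "__main__":
    for key, duration in duration_per_group(ROWS).items():
        print(key, duration)
    for row in time_until_next_hit(ROWS):
        print(row)
```

On these rows the grouped version gives 111 seconds for `?image=582` and 0 for every other URL, because each of those has a single timestamp. The consecutive-hit version gives gaps of 30, 44, 28 and 83 seconds.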

On 1/25/12 2:12 AM, David Houston wrote:
> Hi,
>
> I have a group of records that is output like the below.
>
> ((1010046645226466896,http://www.url.com/),1277793285)
> ((1010046645226466896,http:///www.url.com/?image=580),1277793315)
> ((1010046645226466896,http:///www.url.com/?image=582),1277793359)
> ((1010046645226466896,http:///www.url.com/?image=582),1277793470)
> ((1010046645226466896,http:///www.url.com/?image=585),1277793387)
>
>
> The code that gets me here is:
>
> ht = FOREACH A GENERATE CONCAT(visid_high,visid_low) AS guid, service, hit_time_gmt, page_url as url;
>
> grpd = GROUP ht BY (guid, url) PARALLEL 20;
>
> B = FOREACH grpd {
> t = DISTINCT ht.hit_time_gmt;
>
> GENERATE group, flatten(t);
> }
>
>
> What I'm having difficulty doing is working out how I would subtract the next value from the last to work out how long a user spent on each page.
>
> Any help would be greatly appreciated.
>
>
> Many thanks
>
> Dave