Pig, mail # user - Working on multiple rows


Re: Working on multiple rows
Thejas M Nair 2010-08-29, 01:21
Can you give the rows that belong together a common id and use that? In your
example, can you assign a user-group id to each type of user (or a map of
attributes, if a user can belong to multiple groups), and then process using
that attribute or id?
(I might not have understood the problem correctly; an example of input and
output data would help.)
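
A minimal sketch of the idea, written in Python rather than Pig so it can be
run standalone. The record layout, the sample data, the 300-second window and
the threshold of 3 hits are all assumptions for illustration, not taken from
your data. In Pig the same shape would roughly be a GROUP BY on
(group id, item, time bucket) followed by a COUNT.

```python
from collections import Counter

# Hypothetical records: (user_group_id, item, unix_timestamp_seconds)
hits = [
    ("students", "item_a", 1000),
    ("students", "item_a", 1030),
    ("students", "item_a", 1090),
    ("students", "item_b", 1100),
    ("staff", "item_a", 5000),
]

WINDOW_SECONDS = 300  # assumed meaning of "a short period of time"
MIN_HITS = 3          # assumed popularity threshold


def popular_items(records, window=WINDOW_SECONDS, min_hits=MIN_HITS):
    """Count hits per (group, item, time bucket) and keep the busy ones."""
    # Integer-divide the timestamp so that all hits inside the same
    # fixed-size window share one bucket number.
    counts = Counter(
        (group_id, item, ts // window) for group_id, item, ts in records
    )
    # Report each qualifying bucket by its start time and hit count.
    return sorted(
        (group_id, item, bucket * window, n)
        for (group_id, item, bucket), n in counts.items()
        if n >= min_hits
    )


result = popular_items(hits)
print(result)
```

Fixed buckets are the simplest choice, but a burst that straddles a bucket
boundary gets split in two; if that matters, a sliding window per
(group, item) would be needed instead.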
-Thejas

On 8/28/10 11:10 AM, "Christian Decker" <[EMAIL PROTECTED]> wrote:

> The title might be a bit misleading but I hope you can help me.
> I have some data (let's say a Web Log file) and I want to be able to compare
> multiple items with each other. For example I want to know what items are
> popular in certain user groups, which means that I want to find items which
> got many successive hits from users from that group in a short period of
> time.
> Until now I have only worked on rows in isolation, that is, items
> could be filtered or modified without any knowledge of other records, but
> this now requires considering multiple records, and I have no clue how to
> approach this problem.
>
> Any suggestions?
>
> Regards,
> Chris
>