Pig, mail # user - Possible bug in NULL fields handling

Possible bug in NULL fields handling
Vincent BARAT 2009-10-15, 12:51

I'm not sure whether this is a bug, but NULL fields do not seem to be
handled correctly:

My data (events):


My script:

events = load 'events' using PigStorage(',') AS
(sessionid:chararray, jobid:chararray, user:chararray);
user_events = group events by user;
dump user_events;
event_count_by_user = foreach user_events generate group, COUNT(events);
dump event_count_by_user;

The results:

user_events (correct):

event_count_by_user (incorrect):

event_count_by_user should be:


It seems that tuples starting with (, (that is, tuples whose first field is NULL) are not counted correctly.

Any suggestions?

Thanks a lot