Hadoop, mail # user - Comparing two logs, finding missing records


Comparing two logs, finding missing records
Mark Kerzner 2011-06-26, 04:39
Hi,

I have two logs that should contain records for the same set of record_ids. In
other words, if a record_id is found in the first log, it should also be
found in the second one. However, I suspect that the second log has been
filtered, and I need to find the records that are missing from it. Anything is
allowed: a MapReduce job, Hive, Pig, or even a NoSQL database.
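
To make the requirement concrete, here is a minimal in-memory sketch of the comparison in Python. It is only an illustration of the logic (group by record_id, keep the ids seen in the first log but not the second), not a proposed solution at Hadoop scale. The tab-separated layout with record_id in the first field, and the names record_id() and find_missing(), are assumptions made up for this example.

```python
from collections import defaultdict


def record_id(line):
    # Assumption: record_id is the first tab-separated field of each log line.
    return line.split("\t", 1)[0]


def find_missing(first_log, second_log):
    """Return lines of first_log whose record_id never appears in second_log."""
    # "Map" step: tag every line with the log it came from, keyed by record_id.
    grouped = defaultdict(lambda: {"first": [], "second": []})
    for line in first_log:
        grouped[record_id(line)]["first"].append(line)
    for line in second_log:
        grouped[record_id(line)]["second"].append(line)

    # "Reduce" step: keep groups present in the first log but absent in the second.
    missing = []
    for rid in sorted(grouped):
        sides = grouped[rid]
        if sides["first"] and not sides["second"]:
            missing.extend(sides["first"])
    return missing


if __name__ == "__main__":
    # Hypothetical sample data, not taken from the real logs.
    first = ["1\ta", "2\tb", "3\tc"]
    second = ["1\ta", "3\tc"]
    print(find_missing(first, second))
```

The same grouping idea would presumably carry over to a reduce-side join in MapReduce, or to an outer join that keeps only the rows with no match in the second log.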

Thank you.

This is also a good time to thank all the members of the group, who are
always very helpful.

Sincerely,
Mark