Hadoop >> mail # user >> Comparing two logs, finding missing records


Re: Comparing two logs, finding missing records
Interesting, Bharath, I will look at these.

Mark

On Sun, Jun 26, 2011 at 5:12 PM, Bharath Mundlapudi <[EMAIL PROTECTED]> wrote:

> If you have a SerDe or a Pig loader for your log format, Pig or Hive will
> probably be the quicker solution, using a join.
>
> -Bharath
>
>
>
> ________________________________
> From: Mark Kerzner <[EMAIL PROTECTED]>
> To: Hadoop Discussion Group <[EMAIL PROTECTED]>
> Sent: Saturday, June 25, 2011 9:39 PM
> Subject: Comparing two logs, finding missing records
>
> Hi,
>
> I have two logs that should contain the same set of record_ids. In other
> words, if a record_id is found in the first log, it should also be found in
> the second one. However, I suspect that the second log has been filtered,
> and I need to find the missing records. Anything is allowed: a MapReduce
> job, Hive, Pig, or even a NoSQL database.
>
> Thank you.
>
> It is also a good time to express my thanks to all the members of the group
> who are always very helpful.
>
> Sincerely,
> Mark
>
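
For illustration, the join discussed in this thread boils down to an anti-join on record_id: tag each record with the log it came from, group by record_id, and keep the ids that only ever carry the first log's tag. The sketch below simulates that map/shuffle/reduce flow in plain Python on in-memory lines. It is not from the thread: the function names are hypothetical, and it assumes record_id is the first tab-separated field of each line, which the original message does not state.

```python
from collections import defaultdict
from typing import Iterable, Iterator, List, Tuple


def extract_record_id(line: str) -> str:
    # Assumption: record_id is the first tab-separated field of each line.
    return line.rstrip("\n").split("\t", 1)[0]


def map_phase(lines: Iterable[str], tag: str) -> Iterator[Tuple[str, str]]:
    # Emit (record_id, tag) so the reduce step knows which log each id came from.
    for line in lines:
        if line.strip():  # skip blank lines
            yield extract_record_id(line), tag


def find_missing(first_log: Iterable[str], second_log: Iterable[str]) -> List[str]:
    # Group the tags by record_id, as the shuffle step of a MapReduce job would.
    tags_by_id = defaultdict(set)
    for record_id, tag in map_phase(first_log, "first"):
        tags_by_id[record_id].add(tag)
    for record_id, tag in map_phase(second_log, "second"):
        tags_by_id[record_id].add(tag)
    # Reduce step: keep ids seen in the first log but never in the second.
    return sorted(rid for rid, tags in tags_by_id.items() if tags == {"first"})


if __name__ == "__main__":
    first = ["r1\tlogin", "r2\tclick", "r3\tlogout"]
    second = ["r1\tlogin", "r3\tlogout"]
    print(find_missing(first, second))  # prints ['r2']
```

In Hive or Pig the same idea would typically be written as an outer join (or a cogroup) on record_id followed by a filter for rows with no match in the second log; the Python version only shows the logic, not how to run it at scale.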