Re: Bad records
The job is presumably failing because of exceptions thrown while parsing
records. Trace the exception in the task logs, then wrap the parsing code
that is failing in a try/catch. In the catch block, increment a counter and
continue. Also consider adding a sanity check on each record as the first
thing your mapper does.
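
Roughly, something like this (the tab-separated layout, the field being
parsed, and the RecordQuality counter enum are all made up for
illustration; adapt them to whatever your parser actually does):

import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class SkippingMapper extends Mapper<LongWritable, Text, Text, LongWritable> {

  // Counter names are hypothetical, just for this sketch.
  enum RecordQuality { GOOD, BAD }

  @Override
  protected void map(LongWritable key, Text value, Context context)
      throws IOException, InterruptedException {
    String line = value.toString();

    // Record check as the first thing the mapper does.
    if (line.isEmpty()) {
      context.getCounter(RecordQuality.BAD).increment(1);
      return;
    }

    try {
      // Stand-in for whatever parsing is throwing in your job:
      String[] fields = line.split("\t");
      long amount = Long.parseLong(fields[1]); // may blow up on bad input

      context.getCounter(RecordQuality.GOOD).increment(1);
      context.write(new Text(fields[0]), new LongWritable(amount));
    } catch (RuntimeException e) {
      // Log the bad record (goes to the task's stderr log) and move on.
      System.err.println("Bad record at offset " + key + ": " + line);
      context.getCounter(RecordQuality.BAD).increment(1);
    }
  }
}

The counters show up in the job's console output and in the web UI, so you
can see how many records were dropped without grepping the logs.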

On Sat, Jul 7, 2012 at 3:21 PM, Abhishek <[EMAIL PROTECTED]> wrote:

> hi Russell,
>
> Thanks for the answer. Can you tell me how I would skip bad records in
> my MapReduce code?
>
> Regards
> Abhi
>
> Sent from my iPhone
>
> On Jul 7, 2012, at 5:22 PM, Russell Jurney <[EMAIL PROTECTED]>
> wrote:
>
> > Throw, catch, and handle an exception on bad records. Don't error out.
> > Log the error in your exception handler and increment a counter.
> >
> > For general discussion, see:
> >
> > http://www.quora.com/Big-Data/In-Big-Data-ETL-how-many-records-are-an-acceptable-loss
> >
> > On Sat, Jul 7, 2012 at 1:41 PM, Abhishek <[EMAIL PROTECTED]>
> > wrote:
> >
> >> Hi all,
> >>
> >> If the job is failing because of some bad records, how would I know
> >> which records are bad? Can I put them in a log file and skip them?
> >>
> >> Regards
> >> Abhi
> >>
> >>
> >> Sent from my iPhone
> >>
> >
> >
> >
> > --
> > Russell Jurney twitter.com/rjurney [EMAIL PROTECTED]
> > datasyndrome.com
>

--
Russell Jurney twitter.com/rjurney [EMAIL PROTECTED] datasyndrome.com