Re: Handling bad records
Hi Mohit,
A and B refer to two different output files (the multi-name part of the
named output). The file names will be seq-A* and seq-B*. It's similar to the
"r" in part-r-00000.
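
For the original question (sending invalid XML to its own output), here is a
minimal sketch of the routing logic in plain Java, with no Hadoop dependency,
so it can be run on its own. BadRecordRouter and the "bad" output name are
hypothetical names chosen for illustration. In a real job the two lists would
presumably be replaced by the regular output collector and by a named output
registered through MultipleOutputs, as noted in the comments.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper (not part of Hadoop): separates well-formed XML
// records from malformed ones instead of only logging the failures.
final class BadRecordRouter {
    final List<String> good = new ArrayList<>();
    final List<String> bad = new ArrayList<>();
    private final DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();

    void route(String record) throws IOException, ParserConfigurationException {
        // A fresh builder per record keeps the sketch simple.
        DocumentBuilder builder = factory.newDocumentBuilder();
        // DefaultHandler rethrows fatal errors without printing them to stderr.
        builder.setErrorHandler(new DefaultHandler());
        try {
            builder.parse(new InputSource(new StringReader(record)));
            // In a job (assumption): output.collect(key, value)
            good.add(record);
        } catch (SAXException e) {
            // In a job (assumption): mos.getCollector("bad", reporter).collect(key, value)
            bad.add(record);
        }
    }

    public static void main(String[] args) throws Exception {
        BadRecordRouter router = new BadRecordRouter();
        router.route("<doc><id>1</id></doc>"); // well-formed
        router.route("<doc><id>2</doc>");      // <id> is never closed
        System.out.println("good=" + router.good.size() + " bad=" + router.bad.size());
    }
}
```

Catching SAXException covers malformed input only. Any other validation you
do (schema checks, missing fields) would need its own branch that routes to
the same bad-record output.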

On Tue, Feb 28, 2012 at 11:37 AM, Mohit Anchlia <[EMAIL PROTECTED]> wrote:

> Thanks that's helpful. In that example what is "A" and "B" referring to? Is
> that the output file name?
>
> mos.getCollector("seq", "A", reporter).collect(key, new Text("Bye"));
> mos.getCollector("seq", "B", reporter).collect(key, new Text("Chau"));
>
>
> On Mon, Feb 27, 2012 at 9:53 PM, Harsh J <[EMAIL PROTECTED]> wrote:
>
> > Mohit,
> >
> > Use the MultipleOutputs API:
> >
> > http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/mapred/lib/MultipleOutputs.html
> > to have a named output of bad records. There is an example of use
> > detailed on the link.
> >
> > On Tue, Feb 28, 2012 at 3:48 AM, Mohit Anchlia <[EMAIL PROTECTED]> wrote:
> > > What's the best way to write records to a different file? I am doing
> > > XML processing, and during processing I might come across an invalid
> > > XML format. Currently I have it in a try/catch block and write to
> > > log4j. But I think it would be better to just write it to an output
> > > file that contains only errors.
> >
> >
> >
> > --
> > Harsh J
> >
>
