Hadoop >> mail # user >> Corrupted input data to map


Re: Corrupted input data to map
You can read the input as plain text and do the type conversion in the
mapper. If a NumberFormatException is thrown, you can decide how to
handle it, for example by incrementing a custom Counter to record the
bad line, or by substituting a default value.
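
A minimal sketch of that idea in plain Java, so it runs without the
Hadoop jars. The class name TolerantFloatParsing, its methods, and the
malformedRecords field are all hypothetical; the field only stands in
for a real Hadoop Counter. In an actual mapper you would presumably
increment a counter through the task context (or the Reporter in the
old API) instead of a local field.

```java
public class TolerantFloatParsing {
    // Hypothetical stand-in for a custom Hadoop Counter.
    private long malformedRecords = 0;

    /** Returns the parsed value, or null if the field is not a valid float. */
    public Float parseOrSkip(String field) {
        if (field == null) {
            malformedRecords++;
            return null;
        }
        try {
            return Float.parseFloat(field);
        } catch (NumberFormatException e) {
            // Input such as "3.4.5.6" ends up here instead of failing the task.
            malformedRecords++;
            return null;
        }
    }

    /** Same as parseOrSkip, but substitutes a default for malformed input. */
    public float parseOrDefault(String field, float defaultValue) {
        Float value = parseOrSkip(field);
        return value == null ? defaultValue : value;
    }

    public long getMalformedRecords() {
        return malformedRecords;
    }

    public static void main(String[] args) {
        TolerantFloatParsing parser = new TolerantFloatParsing();
        String[] fields = {"3.4", "3.4.5.6", "7"};
        for (String field : fields) {
            Float value = parser.parseOrSkip(field);
            if (value == null) {
                continue; // skip the corrupted line, keep processing the rest
            }
            System.out.println("parsed " + value);
        }
        System.out.println("malformed records: " + parser.getMalformedRecords());
    }
}
```

Whether to skip the line or substitute a default depends on the job:
skipping keeps bad values out of the reduce input, while a default keeps
the record count unchanged.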

On Sat, Oct 16, 2010 at 5:02 AM, Boyu Zhang <[EMAIL PROTECTED]> wrote:
> Hi all,
>
> I am running a program whose input is 1 million lines of data. Of those 1
> million lines, 5 or 6 are corrupted. The way they are corrupted is: in
> the position where a float number is expected, like 3.4, something like
> 3.4.5.6 appears instead. So when the map runs, it throws a "multiple
> points" number format exception.
>
> My question is: the map tasks that hit the exception are marked as failed.
> What about the data processed by the same map task before the exception?
> Does it reach the reduce task, or is it treated as garbage? Thank you very
> much; any help is appreciated.
>
> Boyu
>

--
Best Regards

Jeff Zhang