Hadoop, mail # user - How to IO catch exceptions using python


xavier.quintuna@... 2009-10-19, 17:58
Re: How to IO catch exceptions using python
Jeff Hammerbacher 2009-10-19, 18:02
Hey Xavier,

The functionality you are looking for was added in 0.19 and is in later releases:
http://issues.apache.org/jira/browse/HADOOP-3828. If you upgrade your
cluster to CDH2, you should be good to go.

Regards,
Jeff

On Mon, Oct 19, 2009 at 10:58 AM, <[EMAIL PROTECTED]> wrote:

> Hi Everybody,
>
> I'm working on a project where I have to read a large set of compressed
> (gz) files. I'm using Python and streaming to do this. However, I
> have a problem: some of the compressed files are corrupt, and they are
> killing my map/reduce jobs.
> My environment is the following:
> Hadoop-0.18.3 (CDH1)
>
>
> Do you have any recommendations for handling this case?
> How can I catch that exception in Python so that my jobs don't fail?
> How can I identify these files using Python and move them to a
> corrupt-file folder?
>
> I'd really appreciate any recommendations.
>
> Xavier
>
>
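
On the question of identifying corrupt files from Python and moving them aside, here is a minimal sketch. It is not from the thread, and it assumes the .gz files are reachable on a local filesystem before the job runs (files in HDFS would have to be fetched first). The function names `is_gzip_readable` and `quarantine_corrupt` and the `corrupt_dir` argument are hypothetical, chosen only for illustration. The idea is to decompress each file to the end, treat any decompression error as "corrupt", and move such files out of the input set.

```python
import gzip
import os
import shutil
import zlib


def is_gzip_readable(path, chunk_size=1024 * 1024):
    """Return True if the whole gzip stream at `path` decompresses cleanly."""
    try:
        with gzip.open(path, "rb") as handle:
            # Read to the end so truncation and CRC errors surface.
            while handle.read(chunk_size):
                pass
    except (OSError, EOFError, zlib.error):
        # Bad header, truncated stream, or damaged deflate data.
        return False
    return True


def quarantine_corrupt(paths, corrupt_dir):
    """Move unreadable files into `corrupt_dir`; return the readable ones."""
    os.makedirs(corrupt_dir, exist_ok=True)
    good = []
    for path in paths:
        if is_gzip_readable(path):
            good.append(path)
        else:
            target = os.path.join(corrupt_dir, os.path.basename(path))
            shutil.move(path, target)
    return good
```

Running this as a pre-pass means the streaming job only ever sees files that decompress fully. The cost is reading every file once more, so it may not suit very large inputs.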
xavier.quintuna@... 2009-10-19, 18:53