Re: Hadoop problems
At the next ApacheCon, Kathleen Ting, one of Cloudera's Customer
Operations Engineers, will give a talk related to this topic. I don't
have the exact link right now, but you can easily find it in the Big
Data track of the conference. She gave a similar talk at Hadoop World
2011. You can see it here [1].

You should also read the book "Hadoop Operations", written by Eric Sammer,
Engineering Manager at Cloudera and an expert in all this stuff.

Both of them consistently point out that cluster misconfiguration is the
primary cause of cluster failures. As you said, disk failure is a
possible cause too, but there are more:
- Disk full
- Too many open files for a particular user
- JVM and GC related issues
- Use of the OpenJDK VM instead of the Oracle Java VM
- NTP synchronization issues
- SSH related issues
- and many more
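
Two of the items above (disk full and too many open files) are easy to
check from a script on each node. Here is a minimal sketch in Python,
assuming a Unix-like host. The function names and the thresholds (90%
disk usage, 10240 open files) are my own illustrative choices, not values
taken from the talk or the book, so adjust them to your cluster:

```python
import resource  # Unix-only standard library module
import shutil


def disk_usage_percent(path="/"):
    """Return the used space on the filesystem holding `path`, in percent."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def open_file_limit():
    """Return the soft limit on open file descriptors for this process."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return soft


def check_node(path="/", max_disk_percent=90.0, min_open_files=10240):
    """Collect warnings for the two checks; both thresholds are assumptions."""
    warnings = []

    pct = disk_usage_percent(path)
    if pct >= max_disk_percent:
        warnings.append("disk: %s is %.1f%% full" % (path, pct))

    limit = open_file_limit()
    # RLIM_INFINITY means "no limit", so only compare real numeric limits.
    if limit != resource.RLIM_INFINITY and limit < min_open_files:
        warnings.append("nofile: soft limit is only %d" % limit)

    return warnings


if __name__ == "__main__":
    for warning in check_node():
        print(warning)
```

Run it as the same user that runs the Hadoop daemons, because the open
file limit is per user, and point `path` at the data directories you
actually care about.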
[1] http://bit.ly/cloudera_talk

  Best wishes
On 16/02/2013 23:18, Henjarappa, Savitha wrote:
> All,
> What are the most common problems that a Hadoop Administrator should
> be on top of?
> What would be the possible reasons for a job failure? I understand
> disk failure is one of the reasons.
> Thanks,
> Savitha

-- Marcos Ortíz Valmaseda
Product Manager && Data Scientist at UCI
Blog: http://marcosluis2186.posterous.com
LinkedIn: http://www.linkedin.com/in/marcosluis2186
Twitter: @marcosluis2186 <https://twitter.com/marcosluis2186>