Re: Hadoop problems
Marcos Ortiz 2013-02-17, 05:42
At the next ApacheCon, Kathleen Ting, one of Cloudera's Customer Operations Engineers, will give a talk on this topic. I don't have the exact link right now, but you can easily find it in the Big Data track of the conference. She gave a similar talk at Hadoop World 2011; you can see it here [1].

You should also read the book "Hadoop Operations" by Eric Sammer, Engineering Manager at Cloudera and an expert in all this stuff.

Both of them always say that cluster misconfiguration is the primary cause of cluster failures. Like you said, disk failure is a possible cause too, but there are more:

- Disk full
- Too many open files for a particular user
- JVM and GC related issues
- Use of the OpenJDK VM instead of the Oracle Java VM
- NTP synchronization issues
- SSH related issues
- and many more

[1] http://bit.ly/cloudera_talk

Best wishes

On 16/02/2013 23:18, Henjarappa, Savitha wrote:
> All,
> What are the most common problems that a Hadoop Administrator should
> be on top of?
> What would be the possible reasons for a job failure? I understand
> disk failure is one of the reasons.
> Thanks,
> Savitha

--
Marcos Ortíz Valmaseda
Product Manager && Data Scientist at UCI
Blog: http://marcosluis2186.posterous.com
LinkedIn: http://www.linkedin.com/in/marcosluis2186
Twitter: @marcosluis2186 <https://twitter.com/marcosluis2186>