Hive user mailing list: Always mysterious "Could not obtain block" error for large jobs [Hive 0.8.1 / Hadoop 1.0.0 on Mac Mini Cluster]


Re: Always mysterious "Could not obtain block" error for large jobs [Hive 0.8.1/ Hadoop 1.0.0 on Mac Mini Cluster]
Another critical variable to check is dfs.datanode.max.xcievers. The
default value is 256. You should bump that up to 4096 or higher.

-Vijay
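
For reference, a minimal sketch of that setting as it would go in conf/hdfs-site.xml on each datanode (assuming a stock Hadoop 1.0.0 layout; the property name really is spelled "xcievers", and the datanodes must be restarted for the change to take effect):

  <property>
    <name>dfs.datanode.max.xcievers</name>
    <value>4096</value>
    <!-- Upper bound on concurrent DataXceiver threads serving block
         reads/writes per datanode; the default of 256 is easily
         exhausted by large Hive scans. -->
  </property>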

On Thu, Mar 1, 2012 at 7:07 PM, Abhishek Parolkar <[EMAIL PROTECTED]> wrote:
> Hi there!
>   I have been doing an interesting experiment building a Mac mini cluster
> (http://www.scribd.com/doc/76827185/Mac-Mini-Hadoop-Cluster).
>   I am continuously getting "java.io.IOException: java.io.IOException: Could
> not obtain block: blk_-" errors when I run Hive queries on a large set of
> data.
>
> Queries on a small set of data (about 10 GB) work fine, but a query over the
> large set (about 170 GB) fails with that error.
> The data is stored as SEQUENCEFILE, partitioned by date and hour, with each
> file about 160 MB.
>
> Here is what the JobTracker says about the map/reduce job:
> http://screencast.com/t/hQ9Y7zsaO
> (more detail: http://screencast.com/t/jHplMXHXuys)
>
> Searching for the issue, I found that many people face this problem
> because of:
> 1.) The block not being available on any of the datanodes: http://bit.ly/wFGgEF
> 2.) Hadoop not being able to open enough file descriptors (a ulimit
> issue): http://bit.ly/wi4fg8
>
> I fixed all of that and ran the query again, but no luck (my ulimit -n is
> 65534); see the command sketch after this message.
>
>
> My configuration:
> Hadoop version: 1.0.0
> Platform: OS X 10.7.2 (Mac mini)
> Nodes: 3 DataNodes, 1 NameNode, 1 JobTracker, 3 TaskTrackers
> Hive version: 0.8.1
> ulimit -a on all nodes: http://pastie.org/private/ukxeuqcz31qckmn9hiqsba
> Memory per node (sysctl -n hw.memsize): 4.096 GB
> Free memory: 1.89 GB
> Output of allmemory: http://pastie.org/private/drscsrbxf6dg7t9pwoc1g
>
>
> FSCK of the whole external table location:
> http://pastie.org/private/ki0xxfnuaoi1xkxbkylrlw
> HDFS report: http://pastie.org/private/ahinnwty2v6exrapre65ta
>
>
> -v_abhi_v
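
The checks referenced above can be reproduced with the stock Hadoop 1.0 command-line tools; a quick sketch (the warehouse path is hypothetical, substitute the table's actual external location):

  # Verify every block of the table's files is reachable on some datanode
  hadoop fsck /user/hive/warehouse/my_table -files -blocks -locations

  # Cluster-wide capacity, and whether any datanode is dead or excluded
  hadoop dfsadmin -report

  # Open-file-descriptor limit; run on each node as the user running Hadoop
  ulimit -n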