Hive >> mail # user >> CombineInputFormat not working unless hive.hadoop.supports.splittable.combineinputformat=true


CombineInputFormat not working unless hive.hadoop.supports.splittable.combineinputformat=true
Is there a reason CombineInputFormat doesn't work for small files unless
hive.hadoop.supports.splittable.combineinputformat is set to true?

Additionally, when we use this setting with enough LZO files, we run into
errors of the form:

2013-08-02 15:02:43,553 WARN com.hadoop.compression.lzo.LzopInputStream: IOException in getCompressedData; likely LZO corruption.
java.io.IOException: Compressed length 1648850803 exceeds max block size 67108864 (probably corrupt file)
 at com.hadoop.compression.lzo.LzopInputStream.getCompressedData(LzopInputStream.java:286)
 at com.hadoop.compression.lzo.LzopInputStream.decompress(LzopInputStream.java:256)
 at org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:83)
 at java.io.InputStream.read(InputStream.java:85)
.....
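
As a quick sanity check on the two numbers in that message, here is a small Python sketch. It assumes the length field is a 4-byte big-endian integer, which is a guess about how the reader parses the block header and is not confirmed by the log. The helper name describe_length is made up for illustration. If the four bytes turn out to be printable ASCII, that might hint that the reader picked up ordinary data rather than a real block header, but that reading is speculation.

```python
# Values copied from the IOException message in the log above.
reported_length = 1648850803
max_block_size = 67108864


def describe_length(length: int, limit: int) -> dict:
    """Summarize a reported block length against the allowed maximum.

    Assumption: the length was read as a 4-byte big-endian integer.
    """
    raw = length.to_bytes(4, "big")
    return {
        "limit_mib": limit / (1024 * 1024),   # the limit expressed in MiB
        "ratio": length / limit,              # how far over the limit we are
        "raw_bytes": raw,                     # the four bytes behind the number
        "printable_ascii": all(32 <= b < 127 for b in raw),
    }


info = describe_length(reported_length, max_block_size)
for key, value in info.items():
    print(f"{key}: {value}")
```

The limit works out to exactly 64 MiB, and the reported length is roughly 24 times larger than that, so it is not a near miss.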

Thanks.