Hive >> mail # user >> CombineInputFormat not working unless hive.hadoop.supports.splittable.combineinputformat=true


CombineInputFormat not working unless hive.hadoop.supports.splittable.combineinputformat=true
Is there a reason CombineInputFormat doesn't work for small files unless
hive.hadoop.supports.splittable.combineinputformat is set to true?

Additionally, when using this with enough LZO files, we run into errors of
the form:

2013-08-02 15:02:43,553 WARN com.hadoop.compression.lzo.LzopInputStream: IOException in getCompressedData; likely LZO corruption.
java.io.IOException: Compressed length 1648850803 exceeds max block size 67108864 (probably corrupt file)
 at com.hadoop.compression.lzo.LzopInputStream.getCompressedData(LzopInputStream.java:286)
 at com.hadoop.compression.lzo.LzopInputStream.decompress(LzopInputStream.java:256)
 at org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:83)
 at java.io.InputStream.read(InputStream.java:85)
.....
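
For reference, the two numbers in the exception can be sanity-checked with a few lines of Python. This is only a rough sketch: it assumes the length field is read as a big-endian 32-bit integer, and the helper name describe_length is made up for illustration. It shows that the limit is exactly 64 MiB and that the reported length, viewed as four raw bytes, is printable ASCII. That would be consistent with the reader landing on payload bytes rather than a real block header, though it does not prove it.

```python
import struct

MAX_BLOCK_SIZE = 67108864      # limit quoted in the exception message
REPORTED_LENGTH = 1648850803   # "compressed length" quoted in the exception message


def describe_length(value: int) -> dict:
    """Hypothetical helper: view a 32-bit length as its raw bytes.

    Assumption: the length field is a big-endian unsigned 32-bit integer.
    """
    raw = struct.pack(">I", value)
    return {
        "mib": value / (1024 * 1024),
        "hex": raw.hex(),
        "ascii": raw.decode("ascii", errors="replace"),
        "printable": all(32 <= b < 127 for b in raw),
    }


info = describe_length(REPORTED_LENGTH)
print("limit is 64 MiB:", MAX_BLOCK_SIZE == 64 * 1024 * 1024)
print("reported length in MiB:", round(info["mib"], 1))
print("raw bytes:", info["hex"], repr(info["ascii"]), "printable:", info["printable"])
```

If the four bytes come out as ordinary text, one thing worth checking is whether the files themselves still decompress cleanly outside of Hive.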

Thanks.