MapReduce >> mail # user >> Seekable interface and CompressInputStream question


Seekable interface and CompressInputStream question

Hi,
I have a question about the Seekable interface. I am currently using the CDH3 release, which is based on Hadoop 0.20.2. As I understand it, in this release CompressionInputStream throws UnsupportedOperationException from the methods it inherits from the Seekable interface, because they are not implemented.
My question is: does Seekable mean that the underlying InputStream supports splitting? If an InputStream is seekable, then it should also be splittable, right?
If so, I assume that a future Hadoop release will have CompressionInputStream implement Seekable. But my understanding is that some compression formats can be split and some cannot. Suppose the data file is a gzip file and I get a CompressionInputStream, backed by the Gzip codec, that does support Seekable. I would assume it is splittable, but in fact it isn't. How do I write a generic InputFormat that supports both splittable and unsplittable compressed input streams in this case? Or is my understanding incorrect, and are Seekable and splitting totally different things?
Thanks
Yong    
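
The distinction asked about above can be illustrated with a minimal sketch. It assumes that whether a file can be split is decided by the type of its codec, not by whether the stream implements Seekable, which appears to be how later Hadoop releases handle it through the `SplittableCompressionCodec` interface. Everything in the block (`Codec`, `SplittableCodec`, `GzipLikeCodec`, `Bzip2LikeCodec`, `codecFor`, and this `isSplitable`) is a hypothetical stand-in so the example runs without Hadoop on the classpath; none of it is the actual Hadoop API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class SplitCheckSketch {
    /** Hypothetical stand-in for a compression codec; not the Hadoop interface. */
    interface Codec {
        String extension();
    }

    /** Hypothetical marker: streams of this codec can be read starting mid-file. */
    interface SplittableCodec extends Codec {}

    /** gzip is a single continuous stream, so it is modelled as not splittable. */
    static final class GzipLikeCodec implements Codec {
        public String extension() { return ".gz"; }
    }

    /** bzip2 is block-oriented, so it is modelled as splittable. */
    static final class Bzip2LikeCodec implements SplittableCodec {
        public String extension() { return ".bz2"; }
    }

    // Registry keyed by file extension (an assumption made for this sketch).
    private static final Map<String, Codec> CODECS = new LinkedHashMap<>();
    static {
        for (Codec c : new Codec[] { new GzipLikeCodec(), new Bzip2LikeCodec() }) {
            CODECS.put(c.extension(), c);
        }
    }

    /** Returns the codec matching the file extension, or null if none matches. */
    static Codec codecFor(String fileName) {
        for (Map.Entry<String, Codec> e : CODECS.entrySet()) {
            if (fileName.endsWith(e.getKey())) {
                return e.getValue();
            }
        }
        return null;
    }

    /** Generic split check: uncompressed files split; compressed ones only if the codec allows it. */
    public static boolean isSplitable(String fileName) {
        Codec codec = codecFor(fileName);
        if (codec == null) {
            return true; // no codec matched, treat as plain uncompressed data
        }
        return codec instanceof SplittableCodec;
    }

    public static void main(String[] args) {
        System.out.println(isSplitable("part-00000.txt")); // prints "true"
        System.out.println(isSplitable("part-00000.gz"));  // prints "false"
        System.out.println(isSplitable("part-00000.bz2")); // prints "true"
    }
}
```

In this sketch the split decision never consults Seekable: a gzip stream that happened to support seeking would still be reported as unsplittable, because the check depends only on the codec's type.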