MapReduce >> mail # user >> Re: How to compress MapFile programmatically


Re: How to compress MapFile programmatically
A MapFile.Reader automatically detects compression and decompresses the
data without being told anything special. In general you don't have to
worry about decompressing files yourself in Apache Hadoop: the framework
handles it transparently as long as you use the proper APIs.

On Sun, Aug 11, 2013 at 8:49 PM, Abhijit Sarkar
<[EMAIL PROTECTED]> wrote:
> Thanks Harsh. However, if I compress the MapFile using the MapFile.Writer
> constructor option and then put it in the DistributedCache, how do I
> uncompress it in the Map/Reduce task? There doesn't appear to be any API
> method to do that.
>
> Regards,
> Abhijit
>
>> From: [EMAIL PROTECTED]
>> Date: Sun, 11 Aug 2013 12:56:43 +0530
>> Subject: Re: How to compress MapFile programmatically
>> To: [EMAIL PROTECTED]
>
>>
>> A MapFile isn't a single file. It is a directory _containing_ two files
>> (data and index). You cannot "open" a directory for reading.
>>
>> The MapFile API is documented at
>> http://hadoop.apache.org/docs/stable/api/org/apache/hadoop/io/MapFile.html
>> and that's what you should use for reading and writing them.
>>
>> Compression is a simple option you need to provide when invoking the
>> writer:
>> http://hadoop.apache.org/docs/stable/api/org/apache/hadoop/io/MapFile.Writer.html#MapFile.Writer(org.apache.hadoop.conf.Configuration,%20org.apache.hadoop.fs.FileSystem,%20java.lang.String,%20org.apache.hadoop.io.WritableComparator,%20java.lang.Class,%20org.apache.hadoop.io.SequenceFile.CompressionType,%20org.apache.hadoop.io.compress.CompressionCodec,%20org.apache.hadoop.util.Progressable)
>>
>> On Sun, Aug 11, 2013 at 1:46 AM, Abhijit Sarkar
>> <[EMAIL PROTECTED]> wrote:
>> > Hi,
>> > I'm a Hadoop newbie. This is my first question to this mailing list,
>> > hoping for a good start :)
>> >
>> > MapFile is a directory, so when I try to open an InputStream to it, it
>> > fails with FileNotFoundException. How do I compress a MapFile
>> > programmatically?
>> >
>> > Code snippet:
>> > final FileSystem fs = FileSystem.get(conf);
>> > final InputStream inputStream = fs.open(new Path(uncompressedStr));
>> >
>> > Exception:
>> > java.io.FileNotFoundException: /some/directory (No such file or directory)
>> > at java.io.FileInputStream.open(Native Method)
>> > at java.io.FileInputStream.<init>(FileInputStream.java:120)
>> > at org.apache.hadoop.fs.RawLocalFileSystem$TrackingFileInputStream.<init>(RawLocalFileSystem.java:71)
>> > at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileInputStream.<init>(RawLocalFileSystem.java:107)
>> > at org.apache.hadoop.fs.RawLocalFileSystem.open(RawLocalFileSystem.java:177)
>> > at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSInputChecker.<init>(ChecksumFileSystem.java:126)
>> > at org.apache.hadoop.fs.ChecksumFileSystem.open(ChecksumFileSystem.java:283)
>> > at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:427)
>> > at name.abhijitsarkar.learning.hadoop.io.IOUtils.compress(IOUtils.java:104)
>> >
>> > Regards,
>> > Abhijit
>>
>>
>>
>> --
>> Harsh J

--
Harsh J
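
A note on the FileNotFoundException in the original question: the bottom
frames of that stack trace are in java.io.FileInputStream, which refuses to
open a directory. The sketch below reproduces that behaviour with the JDK
alone, so no Hadoop classes are involved. The class and method names are
hypothetical, and the "data" and "index" files hold placeholder bytes rather
than real MapFile contents; only the directory-with-two-files layout is taken
from the thread.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class: mimics the on-disk layout of a MapFile
// (a directory holding "data" and "index") using only the JDK.
class MapFileLayoutDemo {

    // Creates a temporary directory with two placeholder files.
    // The bytes are dummies, not valid SequenceFile data.
    static File createFakeMapFileDir() {
        try {
            Path dir = Files.createTempDirectory("fake-mapfile");
            Files.write(dir.resolve("data"), new byte[] {1, 2, 3});
            Files.write(dir.resolve("index"), new byte[] {4});
            return dir.toFile();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Returns true if the target can be opened as a byte stream.
    // FileInputStream throws FileNotFoundException (an IOException)
    // when the target is a directory, so this returns false for it.
    static boolean canOpenAsStream(File target) {
        try (InputStream in = new FileInputStream(target)) {
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        File dir = createFakeMapFileDir();
        System.out.println("directory opens as stream: " + canOpenAsStream(dir));
        System.out.println("data file opens as stream: "
                + canOpenAsStream(new File(dir, "data")));
    }
}
```

This only illustrates why fs.open() on the MapFile path fails. For real
MapFiles, the advice in the thread stands: read and write them through
MapFile.Reader and MapFile.Writer rather than through raw streams.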