Keith Wiley 2013-02-14, 22:20
0.19 is really old, and that's probably why the Text utility (fs -text)
doesn't support automatic decompression based on file extensions (or
specifically, of .deflate).
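In the meantime, the file can probably be inflated by hand after copying it to local disk. As far as I know, the .deflate output written by Hadoop's default codec is a zlib-wrapped DEFLATE stream rather than gzip, which would explain why gunzip does not recognize it. A minimal Python sketch under that assumption (the function name and the file names in the usage note are only illustrative):

```python
import zlib

def inflate_stream(src, dst, chunk_size=64 * 1024):
    """Copy a zlib-wrapped DEFLATE stream from src to dst, decompressing it.

    src and dst are binary file-like objects. Assumes the input carries the
    standard zlib header, which is what the default wbits setting expects.
    """
    decompressor = zlib.decompressobj()
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        dst.write(decompressor.decompress(chunk))
    # Emit anything still buffered inside the decompressor.
    dst.write(decompressor.flush())
```

For example, after "hadoop fs -get" of the output file, something like inflate_stream(open("part-00000.deflate", "rb"), open("part-00000.txt", "wb")) should produce the plain text. If zlib raises an "incorrect header check" error, the file is not in the format assumed here.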
Did the job.xml of the job that produced this output also carry
mapred.output.compress=false in it? The file should be viewable on the
JobTracker (JT) UI page for the job. Unless explicitly turned on, even
0.19 wouldn't have enabled compression on its own.
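If you save a copy of that job.xml locally, the check can also be scripted. A rough sketch, assuming the usual Hadoop configuration layout of property elements with name and value children (the helper name is made up for illustration):

```python
import xml.etree.ElementTree as ET

def get_property(job_xml_text, name):
    """Return the value of the named property in a Hadoop-style config XML
    string, or None if the property is not present."""
    root = ET.fromstring(job_xml_text)
    for prop in root.iter("property"):
        if prop.findtext("name") == name:
            return prop.findtext("value")
    return None
```

Calling get_property(open("job.xml").read(), "mapred.output.compress") then shows what the job actually ran with, which may differ from the default configuration shipped with the EC2 scripts.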
On Fri, Feb 15, 2013 at 3:50 AM, Keith Wiley <[EMAIL PROTECTED]> wrote:
> I just got hadoop running on EC2 (0.19 just because that's the AMI the scripts seemed to go for). The PI example worked and I believe the wordcount example worked too. However, the output file is in .deflate format. "hadoop fs -text" fails to decompress the file -- it produces the same binary output as "hadoop fs -cat", which I find counterintuitive; isn't -text specifically supposed to handle this situation?
> I copied the file to local and tried manually decompressing it with gunzip and lzop (by appending appropriate suffixes), but both tools failed to recognize the file. To add to the confusion, I see this in the default configuration offered by the EC2 scripts:
> <description>Should the job outputs be compressed?
> ...so I don't understand why the output was compressed in the first place.
> At this point, I'm kind of stuck. The output shouldn't be compressed to begin with, and all attempts to decompress it have failed.
> Any ideas?
> Keith Wiley [EMAIL PROTECTED] keithwiley.com music.keithwiley.com
> "And what if we picked the wrong religion? Every week, we're just making God
> madder and madder!"
> -- Homer Simpson