-
part-00000.deflate as output
Mark Kerzner 2009-11-26, 05:33
Hi,
I get this part-00000.deflate instead of part-00000.
How do I get rid of the deflate option?
Thank you, Mark
-
Re: part-00000.deflate as output
Amogh Vasekar 2009-11-26, 07:06
Hi, ".deflate" is the file extension added by the default compression codec, which is used when the parameter that turns on compressed output ( mapred.output.compress ) is true. You can choose a different codec via mapred.output.compression.codec; some commonly used ones are available in the org.apache.hadoop.io.compress package.
Amogh
-
Re: part-00000.deflate as output
Tim Kiefer 2009-11-26, 07:10
For testing purposes you can also try to disable the compression:
conf.setBoolean("mapred.output.compress", false);
Then you can look at the output.
- tim
-
Re: part-00000.deflate as output
Mark Kerzner 2009-11-27, 00:59
It worked!
But why is it "for testing"? I only have one job, and I need my results as text. Can I use this fix all the time?
Thank you, Mark
-
Re: part-00000.deflate as output
Aaron Kimball 2009-11-27, 18:44
You are always free to run with compression disabled. But in many production situations, space or performance concerns dictate that all data sets are stored compressed, so I think Tim was assuming that you might be operating in such an environment -- in which case, you'd only need things to appear in plaintext if a human operator is inspecting the output for debugging.
- Aaron
-
Re: part-00000.deflate as output
Patrick Angeles 2009-11-27, 19:12
You can always do
hadoop fs -text <filename>
This will 'cat' the file for you, and decompress it if necessary.
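For anyone curious what the decompression step amounts to: as far as I know, a .deflate part file written by the default codec is a zlib-wrapped DEFLATE stream, so the same format can be round-tripped with plain java.util.zip. The sketch below does not use Hadoop at all. The class name DeflateRoundTrip and the sample records are made up for illustration, and the claim that this matches the on-disk .deflate layout is an assumption, not something confirmed in this thread.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

// Hypothetical helper class, not part of Hadoop.
public class DeflateRoundTrip {

    // Compress text into a zlib-wrapped DEFLATE stream
    // (assumed to be the layout of a .deflate part file).
    static byte[] compress(String text) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (OutputStream out = new DeflaterOutputStream(buffer)) {
            out.write(text.getBytes(StandardCharsets.UTF_8));
        }
        return buffer.toByteArray();
    }

    // Inflate the compressed bytes back into the original text.
    static String decompress(byte[] compressed) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (InputStream in = new InflaterInputStream(new ByteArrayInputStream(compressed))) {
            byte[] chunk = new byte[4096];
            int n;
            while ((n = in.read(chunk)) != -1) {
                buffer.write(chunk, 0, n);
            }
        }
        return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // Made-up tab-separated records standing in for reducer output.
        String original = "key1\tvalue1\nkey2\tvalue2\n";
        byte[] packed = compress(original);
        // Report whether the round trip reproduced the input exactly.
        System.out.println(decompress(packed).equals(original));
    }
}
```

To try it against a real part file copied to local disk, the ByteArrayInputStream could be swapped for a FileInputStream, under the same assumption about the file layout.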
-
Re: part-00000.deflate as output
Mark Kerzner 2009-11-27, 19:25
Thank you, guys, for your very useful answers.
Mark