-
Can spill to disk be in compressed format to reduce I/O?
Frank Grimes 2012-01-12, 15:40
Hi All,
We're trying to speed up an M/R job which combines multiple .avro files. We've noticed that when it spills to disk, it's in uncompressed format. Is there a way to make it spill temporary segments as .avro with Deflate compression?
Thanks,
Frank Grimes
-
Re: Can spill to disk be in compressed format to reduce I/O?
bejoy.hadoop@... 2012-01-12, 15:49
Hi Frank,
Is map output compression enabled?
The config param would be something like mapred.map.output.compress=true (that's from memory, so please cross-check).
Regards,
Bejoy K S
-
Re: Can spill to disk be in compressed format to reduce I/O?
Frank Grimes 2012-01-12, 16:08
I tried conf.setBoolean("mapred.compress.map.output", true); but it didn't seem to work.
Also, since I'm using the Avro mapred APIs, maybe there's something Avro specific to get it enabled? Should I ask on the Avro mailing lists?
Thanks,
Frank Grimes
-
Re: Can spill to disk be in compressed format to reduce I/O?
Arun C Murthy 2012-01-12, 23:15
Temporary map-output files don't use the Avro format. They use a custom format, which should be compressed if you set mapred.compress.map.output to true.
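For reference, a minimal sketch of the job configuration this implies, written as a config fragment. The first property is the one named in this thread. The second, which picks the codec, and the choice of DefaultCodec (zlib/Deflate) are assumptions added for illustration. Later Hadoop releases rename these to mapreduce.map.output.compress and mapreduce.map.output.compress.codec, so check which names your version expects.

```xml
<!-- Compress intermediate map output (spills and the merged map output). -->
<property>
  <name>mapred.compress.map.output</name>
  <value>true</value>
</property>

<!-- Assumption: use the zlib-based DefaultCodec; other codecs can be swapped in. -->
<property>
  <name>mapred.map.output.compression.codec</name>
  <value>org.apache.hadoop.io.compress.DefaultCodec</value>
</property>
```

The same key can be set programmatically, as in the conf.setBoolean("mapred.compress.map.output", true) call quoted earlier in the thread.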
Arun
-
RE: Can spill to disk be in compressed format to reduce I/O?
Tim Broberg 2012-01-12, 23:25
So the initial input stream is decompressed, then each temporary file gets compressed and decompressed, and the Avro output is then recompressed and decompressed again at the reducers?
I'm counting 2 compressions and 2 decompressions at the mappers and 1 decompression at the reducers.
Am I getting this right?
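To make one of those compress/decompress pairs concrete, here is a standalone sketch using java.util.zip. It is not Hadoop's spill code: the class name SpillRoundTrip, the helper names, and the sample records are all hypothetical, and it only illustrates a single Deflate pass on write and a single inflate pass on read for one intermediate segment.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

public class SpillRoundTrip {

    // Build a fake, repetitive "segment" of tab-separated records (made-up data).
    static byte[] sampleSegment(int records) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < records; i++) {
            sb.append("user-").append(i % 50).append("\tclick\t1\n");
        }
        return sb.toString().getBytes(StandardCharsets.UTF_8);
    }

    // One compression pass: the CPU cost paid before the bytes reach disk.
    static byte[] deflate(byte[] raw) {
        Deflater deflater = new Deflater(Deflater.DEFAULT_COMPRESSION);
        deflater.setInput(raw);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        while (!deflater.finished()) {
            int n = deflater.deflate(buf);
            out.write(buf, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }

    // One decompression pass: the cost paid by whoever reads the segment back.
    static byte[] inflate(byte[] packed) {
        Inflater inflater = new Inflater();
        inflater.setInput(packed);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(buf);
                // Guard against spinning forever on truncated input.
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new IllegalStateException("truncated or unsupported input");
                }
                out.write(buf, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalStateException("corrupt compressed data", e);
        } finally {
            inflater.end();
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] raw = sampleSegment(10_000);
        byte[] packed = deflate(raw);
        byte[] restored = inflate(packed);
        System.out.println("raw bytes:      " + raw.length);
        System.out.println("deflated bytes: " + packed.length);
        System.out.println("round trip ok:  " + Arrays.equals(raw, restored));
    }
}
```

Whether the saved disk and network I/O outweighs the extra CPU depends on the data and the codec, so timing a real job with compression on and off is the only reliable check.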
- Tim.