MapReduce, mail # user - Make job output be a comma separated file


RE: Make job output be a comma separated file
Botelho, Andrew 2013-07-18, 18:19
I am using the latest version of Hadoop - Hadoop 2.0.5.

From: Ravi Kiran [mailto:[EMAIL PROTECTED]]
Sent: Thursday, July 18, 2013 2:16 PM
To: [EMAIL PROTECTED]
Subject: Re: Make job output be a comma separated file

Hi Andrew,

  Can you please tell me which version of Hadoop you are using? I noticed that in Hadoop 1.0.4, the class org.apache.hadoop.mapreduce.lib.output.TextOutputFormat looks for mapred.textoutputformat.separator.
Regards
Ravi M

On Thu, Jul 18, 2013 at 11:32 PM, Botelho, Andrew <[EMAIL PROTECTED]> wrote:
I believe that mapred.textoutputformat.separator is from the old API, but now the field is mapreduce.output.textoutputformat.separator in the new API.
So I ran this code in my driver class, but it is making no difference:

Configuration conf = new Configuration();
conf.set("mapreduce.output.textoutputformat.separator", ",");

Am I changing the field right?

Thanks,
Andrew

From: Ravi Kiran [mailto:[EMAIL PROTECTED]]
Sent: Thursday, July 18, 2013 1:45 PM
To: [EMAIL PROTECTED]
Subject: Re: Make job output be a comma separated file

Hi Andrew,

    You can change the default key/value separator of the output format from "\t" to "," by
setting the property mapred.textoutputformat.separator in the job's Configuration.

   You will face difficulties if this output is the input to another job, as you wouldn't know which part of each row is the key and which is the value.
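
To illustrate the caution above, here is a minimal sketch in plain Java (no Hadoop dependency). It assumes the downstream reader splits each line at the first occurrence of the separator, which is my understanding of how KeyValueTextInputFormat behaves. The class name SeparatorAmbiguity, the helper splitAtFirst, and the sample key and value are hypothetical and only serve the illustration.

```java
public class SeparatorAmbiguity {

    // Hypothetical helper: split a line into {key, value} at the FIRST separator.
    // If the separator is absent, the whole line is the key and the value is empty.
    public static String[] splitAtFirst(String line, char sep) {
        int i = line.indexOf(sep);
        if (i < 0) {
            return new String[] { line, "" };
        }
        return new String[] { line.substring(0, i), line.substring(i + 1) };
    }

    public static void main(String[] args) {
        String key = "Smith, John";   // sample key that itself contains a comma
        String value = "42";

        // Tab-separated line: the key survives the round trip.
        String[] tab = splitAtFirst(key + "\t" + value, '\t');
        System.out.println("tab:   key=[" + tab[0] + "] value=[" + tab[1] + "]");

        // Comma-separated line: the split lands inside the original key.
        String[] comma = splitAtFirst(key + "," + value, ',');
        System.out.println("comma: key=[" + comma[0] + "] value=[" + comma[1] + "]");
    }
}
```

With a tab the key comes back as "Smith, John"; with a comma it comes back as "Smith" and the rest of the row ends up in the value.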

Regards
Ravi M.

On Thu, Jul 18, 2013 at 10:46 PM, Botelho, Andrew <[EMAIL PROTECTED]> wrote:
What is the best way to make the output of my Hadoop job be comma separated?  Basically, how can I have the keys and values be separated by a comma?
My keys are Text objects, and some of them have actual commas within the field.  Will this matter?

Thanks,

Andrew
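
On the question of commas inside the keys: as far as I know, TextOutputFormat writes the key, the separator, and the value verbatim and does no quoting, so a comma inside a key would be indistinguishable from the separator. One possible workaround, sketched below under that assumption, is to quote each field CSV-style (in the manner of RFC 4180) before emitting it, for example in the reducer. The class name CsvField and the method quote are hypothetical helpers, not part of Hadoop.

```java
public class CsvField {

    // Hypothetical helper: quote a field CSV-style.
    // A field containing a comma, a double quote, or a line break is wrapped in
    // double quotes, and any embedded double quote is doubled.
    public static String quote(String field) {
        boolean needsQuoting = field.indexOf(',') >= 0
                || field.indexOf('"') >= 0
                || field.indexOf('\n') >= 0
                || field.indexOf('\r') >= 0;
        if (!needsQuoting) {
            return field;
        }
        return "\"" + field.replace("\"", "\"\"") + "\"";
    }

    public static void main(String[] args) {
        // Build one output row from a sample key and value.
        String row = quote("Smith, John") + "," + quote("42");
        System.out.println(row);
    }
}
```

In a job, the quoted string would be wrapped in a Text object before being written, and any consumer of the file would need a CSV-aware parser rather than a plain split on commas.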