jamal sasha 2012-11-20, 19:38
Hi,
I wrote a simple MapReduce job using Hadoop Streaming, and I am wondering whether I am doing something wrong.
The job is projected to use around 1,700 mappers but just 1 reducer, for a couple of TB of data.
What can I do to address this?
Basically, the mapper looks like this:
    import sys
    for line in sys.stdin:
        sys.stdout.write(line)
Reducer:
    import sys
    for line in sys.stdin:
        new_line = process_line(line)
        print(new_line)
Thanks
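A minimal, runnable sketch of the two scripts described above, kept in one file for brevity. The post does not show process_line, so the version here is a hypothetical stand-in, and the map/reduce command-line switch is likewise an assumption made only for illustration.

```python
import sys


def process_line(line):
    """Hypothetical stand-in: the original post does not show this function."""
    return line.rstrip("\n")


def map_lines(lines):
    # Identity mapper: emit every input line unchanged.
    for line in lines:
        yield line


def reduce_lines(lines):
    # Reducer: transform each incoming line and restore the trailing newline.
    for line in lines:
        yield process_line(line) + "\n"


# Assumed convention for this sketch: run as "script.py map" or "script.py reduce".
if __name__ == "__main__" and sys.argv[1:] in (["map"], ["reduce"]):
    step = map_lines if sys.argv[1] == "map" else reduce_lines
    for out_line in step(sys.stdin):
        sys.stdout.write(out_line)
```

Writing the steps as generators keeps them testable without a cluster: each one can be fed a plain list of lines.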
Bejoy KS 2012-11-20, 20:09
Hi Sasha,
By default, the number of reducers is set to 1. If you want more, you need to specify it, for example:
hadoop jar myJar.jar myClass -D mapred.reduce.tasks=20 ...
Regards,
Bejoy KS
Sent from handheld, please excuse typos.
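Since the original job uses Hadoop Streaming rather than a custom jar, the same property would be passed to the streaming jar. The sketch below only assembles and prints such a command; the jar location, HDFS paths, and script names are placeholders, not values from this thread.

```python
import shlex


def streaming_command(streaming_jar, input_path, output_path, num_reducers,
                      mapper="mapper.py", reducer="reducer.py"):
    """Build the argument list for a Hadoop Streaming job (illustrative only)."""
    return [
        "hadoop", "jar", streaming_jar,
        # Generic options such as -D go before the streaming-specific options.
        "-D", f"mapred.reduce.tasks={num_reducers}",
        "-input", input_path,
        "-output", output_path,
        "-mapper", mapper,
        "-reducer", reducer,
        # Ship the scripts with the job so every task node can run them.
        "-file", mapper,
        "-file", reducer,
    ]


# Placeholder jar name and paths; adjust to the actual installation.
cmd = streaming_command("hadoop-streaming.jar", "/data/in", "/data/out", 20)
print(shlex.join(cmd))
```

The printed line can be pasted into a shell, or the list can be handed to subprocess.run once the placeholders are replaced.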
Kartashov, Andy 2012-11-20, 21:50
I specify mine inside mapred-site.xml:
<property>
  <name>mapred.reduce.tasks</name>
  <value>20</value>
</property>
Rgds, AK47
alxsss@... 2012-11-20, 22:00
What is the relationship between the number of reducers and the number of CPU cores in your setup? I read somewhere that the number of reducers should be 0.5 times the number of CPU cores.
Thanks.
Alex.
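Nobody in the thread answers this directly. For what it is worth, the Hadoop 1.x MapReduce tutorial ties the reducer count to reduce slots rather than to CPU cores: it suggests 0.95 or 1.75 times (number of nodes * mapred.tasktracker.reduce.tasks.maximum). A small sketch of that rule of thumb follows; the cluster sizes are made-up examples, and the result is a starting point, not a hard requirement.

```python
def suggested_reducers(nodes, reduce_slots_per_node, factor=0.95):
    """Rule-of-thumb reducer count: factor * total reduce slots in the cluster.

    factor=0.95 lets all reducers start at once as the maps finish;
    factor=1.75 lets faster nodes run a second wave for better load balancing.
    """
    total_slots = nodes * reduce_slots_per_node
    return max(1, round(total_slots * factor))


# Made-up cluster: 10 nodes with 2 reduce slots each.
single_wave = suggested_reducers(10, 2)
two_waves = suggested_reducers(10, 2, factor=1.75)
print(single_wave, two_waves)
```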
jamal sasha 2012-11-20, 20:24
Awesome, thanks. Works great now.
Harsh J 2012-11-21, 04:08
Hey Jamal,
I'd recommend first going over the whole tutorial to get a good grip on how Hadoop MR is designed to work: http://hadoop.apache.org/docs/stable/mapred_tutorial.html
--
Harsh J