-
In which configuration file to configure the "fs.inmemory.size.mb" parameter? Yu Li 2010-07-01, 07:42
Hi all,
I looked through the "Cluster Setup" guide at http://hadoop.apache.org/common/docs/r0.20.1/cluster_setup.html and found a "fs.inmemory.size.mb" parameter that specifies the memory allocated for the in-memory file system used to merge map outputs at the reduces. According to the guide, this parameter is set in "core-site.xml". But when I checked "core-default.xml" under "$HADOOP_HOME/src/core/", I didn't find the parameter at all, nor could I find it through the JobTracker UI (JTUI) after launching jobs.

Does anybody know about this parameter? Has it been removed from release 0.20.X? If it hasn't been removed, how can I set it other than with the -D option? Thanks in advance.

Best Regards,
Carp
-
RE: In which configuration file to configure the "fs.inmemory.size.mb" parameter? Srigurunath Chakravarthi 2010-07-01, 09:17
Carp,
IMHO, 0.20.x has it. fs.inmemory.size.mb is the reduce-side equivalent of io.sort.mb. In the reducer tasks, intermediate map output is collected into a buffer (whose size is governed by this parameter's value), and data is flushed into files as (partially) sorted KVs.

These files will be re-merged if we end up with more than io.sort.factor files; otherwise KVs will be served out of these files to the reduce function directly.

I don't know where in the code it is though, sorry.

cheers,
Sriguru
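To make the description above concrete, here is a toy model in Java. It is only a sketch of the behaviour as described in this message, not Hadoop's actual shuffle code: the class `ReduceBufferSketch` and all of its methods are hypothetical names, and the assumption that a flush happens exactly when the next segment would overflow the buffer is a simplification.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of the reduce-side buffering described above; NOT Hadoop code. */
class ReduceBufferSketch {
    private final long bufferLimitBytes; // stands in for fs.inmemory.size.mb, in bytes
    private final int sortFactor;        // stands in for io.sort.factor
    private long buffered = 0;           // bytes currently held in memory
    private final List<Long> spillFiles = new ArrayList<>(); // size of each flushed file

    ReduceBufferSketch(int inMemorySizeMb, int sortFactor) {
        this.bufferLimitBytes = inMemorySizeMb * 1024L * 1024L;
        this.sortFactor = sortFactor;
    }

    /** Collect one map-output segment; flush first if it would overflow the buffer. */
    void collect(long segmentBytes) {
        if (buffered > 0 && buffered + segmentBytes > bufferLimitBytes) {
            flush();
        }
        buffered += segmentBytes;
    }

    /** Write the buffered data out as one (notionally sorted) file. */
    void flush() {
        if (buffered > 0) {
            spillFiles.add(buffered);
            buffered = 0;
        }
    }

    int spillCount() {
        return spillFiles.size();
    }

    /** More files than io.sort.factor means another merge pass is needed. */
    boolean needsRemerge() {
        return spillFiles.size() > sortFactor;
    }

    public static void main(String[] args) {
        ReduceBufferSketch sketch = new ReduceBufferSketch(1, 2); // 1 MB buffer, factor 2
        for (int i = 0; i < 5; i++) {
            sketch.collect(600 * 1024L); // five 600 KB segments
        }
        sketch.flush();
        System.out.println("files: " + sketch.spillCount()
                + ", re-merge needed: " + sketch.needsRemerge());
    }
}
```

In this model a larger buffer means fewer flushed files, so the re-merge step is less likely to be triggered.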
-
Re: In which configuration file to configure the "fs.inmemory.size.mb" parameter? Yu Li 2010-07-01, 09:33
Hi Sriguru,
Thanks for your comments. Do you know how to set this parameter?

Best Regards,
Carp
-
Re: In which configuration file to configure the "fs.inmemory.size.mb" parameter? Ted Yu 2010-07-01, 16:16
I found https://issues.apache.org/jira/browse/HADOOP-6812
You can add the following to core-site.xml:

  <property>
    <name>fs.inmemory.size.mb</name>
    <value>100</value>
  </property>

The default value is 100, as read in ./src/core/org/apache/hadoop/fs/InMemoryFileSystem.java:

  int size = Integer.parseInt(conf.get("fs.inmemory.size.mb", "100"));
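The lookup pattern quoted above also suggests why the parameter can be missing from core-default.xml and still take effect: the fallback "100" lives in the code, not in a defaults file. A minimal sketch of that pattern follows. It uses `java.util.Properties` as a stand-in for Hadoop's `Configuration` so that it runs without Hadoop on the classpath; the class `InMemorySizeLookup` and the method `resolveSizeMb` are hypothetical names for illustration only.

```java
import java.util.Properties;

/** Stand-in for the conf.get(name, default) pattern; Properties replaces Configuration. */
class InMemorySizeLookup {
    static final String KEY = "fs.inmemory.size.mb";

    /** Falls back to "100" when the key is absent, as in the quoted line. */
    static int resolveSizeMb(Properties conf) {
        return Integer.parseInt(conf.getProperty(KEY, "100"));
    }

    public static void main(String[] args) {
        // Key not listed anywhere, as with core-default.xml in the original question.
        Properties defaultsOnly = new Properties();

        // Key set explicitly, as with a <property> entry added to core-site.xml.
        Properties siteOverride = new Properties();
        siteOverride.setProperty(KEY, "200");

        System.out.println(resolveSizeMb(defaultsOnly));
        System.out.println(resolveSizeMb(siteOverride));
    }
}
```

The first lookup yields the in-code default and the second yields the explicitly configured value.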
-
Re: In which configuration file to configure the "fs.inmemory.size.mb" parameter? Yu Li 2010-07-01, 16:38
Hi Ted,
Thanks for your help. Another question: if this parameter is not used in the 0.20.X release and the "Cluster Setup" guide has not been updated, is there any parameter replacing this one? It's a useful parameter IMHO.

Does anyone know about this? Thanks in advance!

Best Regards,
Carp
-
Re: In which configuration file to configure the "fs.inmemory.size.mb" parameter? Ted Yu 2010-07-01, 16:52
fs.inmemory.size.mb is used in 0.20.2
See src/core/org/apache/hadoop/fs/InMemoryFileSystem.java:

  public void initialize(URI uri, Configuration conf) {
    setConf(conf);
    int size = Integer.parseInt(conf.get("fs.inmemory.size.mb", "100"));
    this.fsSize = size * 1024L * 1024L;
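The last quoted line converts megabytes to bytes using `long` literals. A short sketch of why that matters is below. The class `MbToBytes` and both method names are hypothetical; only the expression `size * 1024L * 1024L` comes from the excerpt above.

```java
/** Illustrates the MB-to-bytes conversion from the excerpt; names are hypothetical. */
class MbToBytes {
    /** Same arithmetic as the excerpt: the long literals force 64-bit multiplication. */
    static long toBytes(int sizeMb) {
        return sizeMb * 1024L * 1024L;
    }

    /** Variant with int literals, shown only to demonstrate the overflow it risks. */
    static int toBytesIntOnly(int sizeMb) {
        return sizeMb * 1024 * 1024;
    }

    public static void main(String[] args) {
        System.out.println(toBytes(100));         // default of 100 MB in bytes
        System.out.println(toBytes(4096));        // 4 GB fits in a long
        System.out.println(toBytesIntOnly(4096)); // wraps around in 32-bit arithmetic
    }
}
```

With the default of 100 the result is 104857600 bytes either way; the `long` arithmetic only becomes essential for values of 2048 MB and above.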
-
Re: In which configuration file to configure the "fs.inmemory.size.mb" parameter? Yu Li 2010-07-01, 17:22
Hi Ted,
Thanks for your help. I've seen the code under the path you mentioned, but it seems this class has been deprecated. Do you know of any other parameter that has similar functionality?

Others, any comments/suggestions? Thanks.

Best Regards,
Carp