|
Ben Clay
2011-08-27, 16:33
Uma Maheswara Rao G 72686...
2011-08-27, 19:07
Ted Dunning
2011-08-27, 19:42
Allen Wittenauer
2011-08-27, 22:41
Ben Clay
2011-08-27, 23:08
Aaron T. Myers
2011-08-27, 23:23
Praveen Sripati
2011-08-28, 03:57
Uma Maheswara Rao G 72686...
2011-08-28, 12:03
Ben Clay
2011-08-30, 10:54
Todd Lipcon
2011-08-30, 14:45
Ben Clay
2011-08-30, 15:07
|
-
set reduced block size for a specific file
Ben Clay 2011-08-27, 16:33
I'd like to set a lower block size for a specific file. That is, if HDFS is configured to use 64 MB blocks, I'd like to use 32 MB blocks for a specific file.

Is there a way to do this from the command line, without writing a jar that uses org.apache.hadoop.fs.FileSystem.create()?

I tried the following, but it didn't work:

    hadoop fs -Ddfs.block.size=1048576 -put /local/path /remote/path

I also tried -copyFromLocal. It looks like the -D is being ignored.

Thanks.

-Ben
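A note on the value in that command: 1048576 bytes is 1 MB, not the 32 MB the question asks for. A small sketch of the conversion, assuming binary megabytes (1 MB = 1024 * 1024 bytes); the class and method names are made up for illustration and are not part of Hadoop.

```java
final class BlockSizeBytes {
    /** Converts a size in binary megabytes to bytes (1 MB = 1024 * 1024 bytes). */
    public static long megabytesToBytes(long megabytes) {
        return megabytes * 1024L * 1024L;
    }

    public static void main(String[] args) {
        System.out.println(megabytesToBytes(1));   // 1048576, the value passed with -D above
        System.out.println(megabytesToBytes(32));  // 33554432, the block size the question asks for
        System.out.println(megabytesToBytes(64));  // 67108864, the configured default in the question
    }
}
```

So a 32 MB override would be written as dfs.block.size=33554432.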
-
Re: set reduced block size for a specific file
Uma Maheswara Rao G 72686... 2011-08-27, 19:07
Hi Ben,

Currently there is no way to specify the block size from the command line in Hadoop.

Why can't you write the file from a Java program? Is there a use case that requires you to write some files only from the command line?

Regards,
Uma
-
Re: set reduced block size for a specific file
Ted Dunning 2011-08-27, 19:42
There is no way to do this in standard Apache Hadoop.

But other, otherwise Hadoop-compatible systems, such as MapR, do support this operation. Rather than push commercial systems on this mailing list, I would simply recommend that anybody who is curious email me.
-
Re: set reduced block size for a specific file
Allen Wittenauer 2011-08-27, 22:41
On Aug 27, 2011, at 12:42 PM, Ted Dunning wrote:
> There is no way to do this for standard Apache Hadoop.

Sure there is.

You can build a custom conf dir and point Hadoop at it. You *always* have that option for client-settable options, as a workaround for missing features or bugs.

1. Copy $HADOOP_CONF_DIR or $HADOOP_HOME/conf to a new directory.
2. Modify the hdfs-site.xml in that copy to have your new block size.
3. Run the following:

    HADOOP_CONF_DIR=mycustomconf hadoop dfs -put file dir

Convenient? No. Doable? Definitely.
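Step 2 can also be scripted. A minimal sketch, assuming the property name dfs.block.size used elsewhere in this thread and the usual configuration/property/name/value layout of Hadoop site files; the class and method names are hypothetical. Note that it writes a fresh hdfs-site.xml holding only the override, so any other settings in the copied file would have to be merged in by hand.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

final class BlockSizeOverride {
    /** Renders a site file holding a single dfs.block.size property (value in bytes). */
    public static String renderHdfsSite(long blockSizeBytes) {
        return "<?xml version=\"1.0\"?>\n"
            + "<configuration>\n"
            + "  <property>\n"
            + "    <name>dfs.block.size</name>\n"
            + "    <value>" + blockSizeBytes + "</value>\n"
            + "  </property>\n"
            + "</configuration>\n";
    }

    /** Writes hdfs-site.xml into confDir, creating the directory if needed. */
    public static Path writeHdfsSite(Path confDir, long blockSizeBytes) throws IOException {
        Files.createDirectories(confDir);
        Path siteFile = confDir.resolve("hdfs-site.xml");
        Files.write(siteFile, renderHdfsSite(blockSizeBytes).getBytes(StandardCharsets.UTF_8));
        return siteFile;
    }

    public static void main(String[] args) throws IOException {
        // 16 MB expressed in bytes; "mycustomconf" is the directory name used in step 3.
        Path written = writeHdfsSite(Paths.get("mycustomconf"), 16L * 1024 * 1024);
        System.out.println("wrote " + written);
    }
}
```

The generated directory is then passed through HADOOP_CONF_DIR exactly as in step 3; whether a given release honors the override is a separate question.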
-
RE: set reduced block size for a specific file
Ben Clay 2011-08-27, 23:08
I didn't even think of overriding the config dir. Thanks for the tip!

-Ben
-
Re: set reduced block size for a specific file
Aaron T. Myers 2011-08-27, 23:23
Hey Ben,

I just filed this JIRA to add this feature:
https://issues.apache.org/jira/browse/HDFS-2293

If anyone would like to implement this, I would be happy to review it.

Thanks a lot,
Aaron

--
Aaron T. Myers
Software Engineer, Cloudera
-
Re: set reduced block size for a specific file
Praveen Sripati 2011-08-28, 03:57
Hi,

There are tons of parameters for MapReduce. How can one tell whether a property is a client-side or a server-side property?

Thanks,
Praveen
-
Re: set reduced block size for a specific file
Uma Maheswara Rao G 72686... 2011-08-28, 12:03
Hi Ben,

I just verified on trunk that -D option support is already there in Hadoop:

    /**
     * Print the usage message for generic command-line options supported.
     *
     * @param out stream to print the usage message to.
     */
    public static void printGenericCommandUsage(PrintStream out) {

      out.println("Generic options supported are");
      out.println("-conf <configuration file> specify an application configuration file");
      out.println("-D <property=value> use value for given property");
      out.println("-fs <local|namenode:port> specify a namenode");
      out.println("-jt <local|jobtracker:port> specify a job tracker");
      out.println("-files <comma separated list of files> " +
        "specify comma separated files to be copied to the map reduce cluster");
      out.println("-libjars <comma separated list of jars> " +
        "specify comma separated jar files to include in the classpath.");
      out.println("-archives <comma separated list of archives> " +
        "specify comma separated archives to be unarchived" +
        " on the compute machines.\n");
      out.println("The general command line syntax is");
      out.println("bin/hadoop command [genericOptions] [commandOptions]\n");
    }

Which version of Hadoop are you running?

As part of the JIRA Aaron filed (HDFS-2293), I will post the tests. You can have a look.

Regards,
Uma
-
RE: set reduced block size for a specific file
Ben Clay 2011-08-30, 10:54
Folks-

Thanks for the feedback, and sorry for the delay.

I'm using 0.21 from http://hadoop.apache.org/, and have a default block size of 64 MB. I'd like to copy the file with a 16 MB block size. I tried a couple of different conventions, but it's not taking my override:

    bin/hadoop dfs -D dfs.block.size=16777216 -copyFromLocal /src/file /dest/file

    bin/hadoop fsck /dest/file
    ...
    Status: HEALTHY
     Total size: 29556838357 B
     Total dirs: 0
     Total files: 1
     Total blocks (validated): 441 (avg. block size 67022309 B)
    ...

    bin/hadoop dfs -rmr /dest/file

    bin/hadoop fs -Ddfs.block.size=16777216 -put /src/file /dest/file

    bin/hadoop fsck /dest/file
    ...
    Status: HEALTHY
     Total size: 29556838357 B
     Total dirs: 0
     Total files: 1
     Total blocks (validated): 441 (avg. block size 67022309 B)
    ...

I will try Allen's config dir override shortly, but I cannot get the -D option to work on this installation. Is there some other way to test this functionality?

-Ben
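The fsck numbers are consistent with the 64 MB default still being in effect. A quick check, assuming a file is stored as full blocks plus one partial block at the end (so the block count is the file size divided by the block size, rounded up); the class and method names are illustrative only.

```java
final class BlockCount {
    /** Number of blocks for a file: full blocks plus one partial block if any bytes remain. */
    public static long blocksFor(long fileSizeBytes, long blockSizeBytes) {
        return (fileSizeBytes + blockSizeBytes - 1) / blockSizeBytes;
    }

    public static void main(String[] args) {
        long fileSize = 29556838357L; // "Total size" reported by fsck
        System.out.println(blocksFor(fileSize, 64L * 1024 * 1024)); // 441, matching the fsck output
        System.out.println(blocksFor(fileSize, 16L * 1024 * 1024)); // 1762, expected had the override applied
        System.out.println(fileSize / 441);                         // 67022309, the reported average block size
    }
}
```

A successful 16 MB override would therefore show roughly four times as many blocks and an average block size just under 16777216 B.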
-
Re: set reduced block size for a specific file
Todd Lipcon 2011-08-30, 14:45
0.21 has some bugs with config option deprecation that you're running into. I wouldn't suggest using 0.21; see the warning on the release page that it was never stabilized.

-Todd

Todd Lipcon
Software Engineer, Cloudera
-
RE: set reduced block size for a specific file
Ben Clay 2011-08-30, 15:07
Todd-

Ouch. I'm stuck with 0.21 for the near future, so I'll just write a small app that copies a file using a different block size.

For reference, the config dir override using the following command did not work either:

    HADOOP_CONF_DIR=mycustomconf bin/hadoop dfs -put /src/path /dest/path

Ben Clay
[EMAIL PROTECTED]