Re: Creating and working with temporary file in a map() function
Ondřej Klimpera 2012-04-08, 18:10
Thanks for your advice. File.createTempFile() works great, at least in
pseudo-distributed mode; I hope it behaves the same way on a real cluster.
You saved me hours of trying...

On 04/07/2012 11:29 PM, Harsh J wrote:
> MapReduce sets "mapred.child.tmp" for all tasks to be the Task
> Attempt's WorkingDir/tmp automatically. This also sets the
> -Djava.io.tmpdir prop for each task at JVM boot.
>
> Hence you may use the regular Java API to create a temporary file:
> http://docs.oracle.com/javase/6/docs/api/java/io/File.html#createTempFile(java.lang.String,%20java.lang.String)
>
> These files would also be automatically deleted after the task
> attempt is done.
>
> On Sun, Apr 8, 2012 at 2:14 AM, Ondřej Klimpera<[EMAIL PROTECTED]> wrote:
>> Hello,
>>
>> I would like to ask you if it is possible to create and work with a
>> temporary file while in a map function.
>>
>> I suppose that map function is running on a single node in Hadoop cluster.
>> So what is a safe way to create a temporary file and read from it in one
>> map() run. If it is possible is there a size limit for the file.
>>
>> The file can not be created before hadoop job is created. I need to create
>> and process the file inside map().
>>
>> Thanks for your answer.
>>
>> Ondrej Klimpera.
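
For anyone finding this thread later, here is a minimal sketch of the approach described above, written as plain Java so it runs without a Hadoop installation. The class name `TempFileSketch`, the method `roundTrip`, and the `map-scratch-` prefix are made up for illustration and are not Hadoop APIs; only `File.createTempFile()` comes from the thread. Inside a real map() you would call the same code, and the file would land wherever `java.io.tmpdir` points for that task JVM.

```java
import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

// Hypothetical helper (not part of Hadoop): the temp-file round trip
// that could be performed inside a single map() call.
class TempFileSketch {

    // Writes payload to a temporary file, reads it back, returns the content.
    static String roundTrip(String payload) throws IOException {
        // The file is created under the directory named by java.io.tmpdir.
        // Per the thread, MapReduce sets that property for each task JVM.
        File tmp = File.createTempFile("map-scratch-", ".tmp");
        try {
            Files.write(tmp.toPath(), payload.getBytes(StandardCharsets.UTF_8));
            byte[] bytes = Files.readAllBytes(tmp.toPath());
            return new String(bytes, StandardCharsets.UTF_8);
        } finally {
            // Explicit cleanup, so the sketch does not depend on the
            // framework removing the task's temp directory afterwards.
            tmp.delete();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(roundTrip("hello from map()"));
    }
}
```

Deleting the file in a `finally` block is a defensive choice on my part: the thread says the task's temp files are removed once the attempt finishes, but cleaning up per call keeps disk usage flat if map() runs many times within one task.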