Pig, mail # user - /tmp full, my google-fu weak


Re: /tmp full, my google-fu weak
Kris Coward 2010-12-16, 21:41

Replace /tmp with a symlink into the disk you're trying to use?
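
A rough sketch of what that could look like, run against scratch directories rather than the real /tmp. The helper name relocate_dir and the scratch paths are made up for illustration. On a live system this would presumably need root, nothing holding files open in /tmp while it runs, and the new directory set to mode 1777 as /tmp normally is.

```shell
#!/bin/sh
# Sketch only: replace a directory with a symlink into a bigger disk.
# relocate_dir is a hypothetical helper, not part of Pig or Hadoop.
relocate_dir() {
    src=$1   # directory to replace, e.g. /tmp
    dst=$2   # new home on the larger disk, e.g. /mnt/hadoop-tmp/tmp
    mkdir -p "$dst" || return 1
    # Copy existing contents (including dotfiles), keeping modes and times.
    cp -Rp "$src"/. "$dst"/ || return 1
    # Move the old directory aside, then put the symlink in its place.
    mv "$src" "$src.old" || return 1
    ln -s "$dst" "$src" || return 1
    rm -rf "$src.old"
}

# Dry run on scratch directories (assumed layout, for illustration).
work=$(mktemp -d)
mkdir "$work/tmp" "$work/bigdisk"
echo data > "$work/tmp/part-00000"
relocate_dir "$work/tmp" "$work/bigdisk/tmp"
ls -l "$work"
```

After the dry run, $work/tmp should be a symlink and the file should be readable through it while living under $work/bigdisk/tmp.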

-Kris

On Thu, Dec 16, 2010 at 10:27:19PM +0100, David Vrensk wrote:
> Thanks, that's good to know.  Not to sound ungrateful, but is there any way
> to do this without changing pig versions?  It's not something I'm opposed
> to, but it's rather a bigger procedure than I was hoping for.
>
> /David
>
> On Thu, Dec 16, 2010 at 20:04, Richard Ding <[EMAIL PROTECTED]> wrote:
>
> >  Pig 0.8 allows you to specify its temp directory with the
> > -Dpig.temp.dir=<dir path> option (PIG-103).
> >
> >
> >
> > On 12/16/10 8:18 AM, "David Vrensk" <[EMAIL PROTECTED]> wrote:
> >
> > Hello fellow pig users,
> >
> > I have told pig to use a separate disk for its temp files by setting
> > PIG_OPTS=-Dhadoop.tmp.dir=/mnt/hadoop-tmp but it still keeps a lot of its
> > files in /tmp:
> >
> > /tmp/temp-1035677529$ find . -type f -exec ls -lh '{}' \;
> > -rw-r--r-- 1 pig pig 308K 2010-12-16 14:13 ./tmp82247880/.part-00000.crc
> > -rwxrwxrwx 1 pig pig 39M 2010-12-16 14:13 ./tmp82247880/part-00000
> > -rw-r--r-- 1 pig pig 8 2010-12-16 14:13 ./tmp-1431528563/.part-00000.crc
> > -rwxrwxrwx 1 pig pig 0 2010-12-16 14:04 ./tmp-1431528563/part-00000
> > -rw-r--r-- 1 pig pig 3.0M 2010-12-16 14:01 ./tmp1746442640/.part-00000.crc
> > -rwxrwxrwx 1 pig pig 381M 2010-12-16 14:01 ./tmp1746442640/part-00000
> > -rw-r--r-- 1 pig pig 8.8M 2010-12-16 16:05 ./tmp-1936719424/_temporary/_attempt_local_0003_r_000000_0/.part-00000.crc
> > -rwxrwxrwx 1 pig pig 1.1G 2010-12-16 16:05 ./tmp-1936719424/_temporary/_attempt_local_0003_r_000000_0/part-00000
> > -rw-r--r-- 1 pig pig 38M 2010-12-16 14:13 ./tmp1280814018/.part-00000.crc
> > -rwxrwxrwx 1 pig pig 4.8G 2010-12-16 14:13 ./tmp1280814018/part-00000
> > -rw-r--r-- 1 pig pig 308K 2010-12-16 14:13 ./tmp1738480876/.part-00000.crc
> > -rwxrwxrwx 1 pig pig 39M 2010-12-16 14:13 ./tmp1738480876/part-00000
> >
> > I don't know what these files are and my google-fu is too weak to find
> > anything.
> >
> > FWIW, the command line I currently use to run pig is
> >
> > pig-0.6.0/bin/pig -param input=batch-20101216-130003/* scripts/the_script.pig
> >
> > I'm looking for a way to make pig put all its files on /mnt/hadoop-tmp.
> > Preferably, it should be a command line argument or an environment
> > variable and not tweaking an xml file.  Not only will that make my
> > scripts more transparent, but the xml file I've heard about so far
> > (hadoop-site.xml) resides within the hadoop jar which is pre-built, and
> > I'd rather avoid cracking it open in order to modify its contents.
> > Preferred solution aside, I'm glad for any help!
> >
> > Thanks in advance,
> >
> > David
> >
> > --
> > David Vrensk
> > Systems developer, ICE House AB
> > Mobile: +46 703 74 69 00
> >
> >
>
>
> --
> David Vrensk
> Systems developer, ICE House AB
> Mobile: +46 703 74 69 00

--
Kris Coward http://unripe.melon.org/
GPG Fingerprint: 2BF3 957D 310A FEEC 4733  830E 21A4 05C7 1FEB 12B3