Pig >> mail # user >> /tmp full, my google-fu weak


Re: /tmp full, my google-fu weak

Replace /tmp with a symlink into the disk you're trying to use?
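
A minimal sketch of that swap, rehearsed in a scratch directory rather than on the real /tmp. The sandbox layout and file names are hypothetical stand-ins; /mnt/hadoop-tmp is taken from the original question. On a live system the target directory would also need the usual /tmp permissions (mode 1777), and anything holding files open in /tmp should be stopped first.

```shell
#!/bin/sh
# Sketch only: all paths live under a throwaway sandbox (an assumption made
# for illustration), so nothing on the real system is touched.
set -eu

sandbox=$(mktemp -d)                  # hypothetical stand-in for the filesystem root
mkdir "$sandbox/tmp"                  # plays the role of the full /tmp
mkdir -p "$sandbox/mnt/hadoop-tmp"    # plays the role of the larger disk

# Pretend pig already left a temp file behind.
echo data > "$sandbox/tmp/part-00000"

# Move the existing contents to the big disk, then replace the
# directory with a symlink that points there.
mv "$sandbox/tmp/"* "$sandbox/mnt/hadoop-tmp/"
rmdir "$sandbox/tmp"
ln -s "$sandbox/mnt/hadoop-tmp" "$sandbox/tmp"

# Anything written through the old path now lands on the big disk.
echo more > "$sandbox/tmp/new-file"
ls -l "$sandbox/mnt/hadoop-tmp"
```

Note that the glob in the mv step skips dotfiles, so on a real /tmp the rmdir would fail until hidden files are moved as well.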

-Kris

On Thu, Dec 16, 2010 at 10:27:19PM +0100, David Vrensk wrote:
> Thanks, that's good to know.  Not to sound ungrateful, but is there any way
> to do this without changing pig versions?  It's not something I'm opposed
> to, but it's rather a bigger procedure than I was hoping for.
>
> /David
>
> On Thu, Dec 16, 2010 at 20:04, Richard Ding <[EMAIL PROTECTED]> wrote:
>
> >  Pig 0.8 allows you to specify its temp directory with the
> > -Dpig.temp.dir=<dir path> option (PIG-103).
> >
> >
> >
> > On 12/16/10 8:18 AM, "David Vrensk" <[EMAIL PROTECTED]> wrote:
> >
> > Hello fellow pig users,
> >
> > I have told pig to use a separate disk for its temp files by setting
> > PIG_OPTS=-Dhadoop.tmp.dir=/mnt/hadoop-tmp but it still keeps a lot of its
> > files in /tmp:
> >
> > /tmp/temp-1035677529$ find . -type f -exec ls -lh '{}' \;
> > -rw-r--r-- 1 pig pig 308K 2010-12-16 14:13 ./tmp82247880/.part-00000.crc
> > -rwxrwxrwx 1 pig pig 39M 2010-12-16 14:13 ./tmp82247880/part-00000
> > -rw-r--r-- 1 pig pig 8 2010-12-16 14:13 ./tmp-1431528563/.part-00000.crc
> > -rwxrwxrwx 1 pig pig 0 2010-12-16 14:04 ./tmp-1431528563/part-00000
> > -rw-r--r-- 1 pig pig 3.0M 2010-12-16 14:01 ./tmp1746442640/.part-00000.crc
> > -rwxrwxrwx 1 pig pig 381M 2010-12-16 14:01 ./tmp1746442640/part-00000
> > -rw-r--r-- 1 pig pig 8.8M 2010-12-16 16:05 ./tmp-1936719424/_temporary/_attempt_local_0003_r_000000_0/.part-00000.crc
> > -rwxrwxrwx 1 pig pig 1.1G 2010-12-16 16:05 ./tmp-1936719424/_temporary/_attempt_local_0003_r_000000_0/part-00000
> > -rw-r--r-- 1 pig pig 38M 2010-12-16 14:13 ./tmp1280814018/.part-00000.crc
> > -rwxrwxrwx 1 pig pig 4.8G 2010-12-16 14:13 ./tmp1280814018/part-00000
> > -rw-r--r-- 1 pig pig 308K 2010-12-16 14:13 ./tmp1738480876/.part-00000.crc
> > -rwxrwxrwx 1 pig pig 39M 2010-12-16 14:13 ./tmp1738480876/part-00000
> >
> > I don't know what these files are and my google-fu is too weak to find
> > anything.
> >
> > FWIW, the command line I currently use to run pig is
> >
> > pig-0.6.0/bin/pig -param input=batch-20101216-130003/*
> > scripts/the_script.pig
> >
> > I'm looking for a way to make pig put all its files on /mnt/hadoop-tmp.
> > Preferably, it should be a command line argument or an environment
> > variable rather than tweaking an xml file.  Not only will that make my
> > scripts more transparent, but the xml file I've heard about so far
> > (hadoop-site.xml) resides within the hadoop jar which is pre-built, and
> > I'd rather avoid cracking it open in order to modify its contents.
> > Preferred solution aside, I'm glad for any help!
> >
> > Thanks in advance,
> >
> > David
> >
> > --
> > David Vrensk
> > Systems developer, ICE House AB
> > Mobile: +46 703 74 69 00
> >
> >
>
>
> --
> David Vrensk
> Systems developer, ICE House AB
> Mobile: +46 703 74 69 00

--
Kris Coward http://unripe.melon.org/
GPG Fingerprint: 2BF3 957D 310A FEEC 4733  830E 21A4 05C7 1FEB 12B3