Pig >> mail # dev >> [Changing subject] Pig setup 0.10.0 vs 0.9.1


[Changing subject] Pig setup 0.10.0 vs 0.9.1
Aaah I think that clue helped. This was my setup:

HADOOP_HOME was set in .bashrc

In my wrapper script I unset HADOOP_HOME for pig to pick up the bundled
version 0.20.2. This works for 0.9.1.
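A minimal sketch of such a wrapper (the hand-off line and layout are hypothetical, not my exact script):

```shell
# Clear HADOOP_HOME so Pig 0.9.1 falls back to its bundled Hadoop 0.20.2.
unset HADOOP_HOME
echo "HADOOP_HOME=${HADOOP_HOME:-<unset>}"
# exec "$PIG_HOME/bin/pig" "$@"   # hypothetical hand-off to Pig itself
```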

0.10.0, however, picks up HADOOP_HOME even though I unset it in my wrapper
script. Of course, it worked after removing HADOOP_HOME from .bashrc
altogether.
Though now I see an error: ERROR 2999: Unexpected internal error. Failed to
create DataStorage

This is the output of bin/pig -secretDebugCmd:

Cannot find local hadoop installation, using bundled hadoop 20.2
dry run:
/home/pkommireddi/dev/tools/Linux/jdk/jdk1.6.0_21_x64/bin/java -Xmx1000m
-Djava.library.path=/home/pkommireddi/dev/tools/Linux/hadoop/hadoop-0.20.2/lib/native/Linux-amd64-64
-Dpig.log.dir=/home/pkommireddi/dev/tools/Linux/hadoop/pig-0.10.0/bin/../logs
-Dpig.log.file=pig.log
-Dpig.home.dir=/home/pkommireddi/dev/tools/Linux/hadoop/pig-0.10.0/bin/..
-classpath
/home/pkommireddi/dev/tools/Linux/hadoop/pig-0.10.0/bin/../conf:/home/pkommireddi/dev/tools/Linux/jdk/jdk1.6.0_21_x64/lib/tools.jar:/home/pkommireddi/dev/apps/gridforce/main/hadoop/conf/dev:/home/pkommireddi/dev/tools/Linux/hadoop/pig-0.10.0/bin/../lib/automaton.jar:/home/pkommireddi/dev/tools/Linux/hadoop/pig-0.10.0/bin/../lib/jython-2.5.0.jar:/home/pkommireddi/dev/tools/Linux/hadoop/pig-0.10.0/bin/../pig-0.10.0.jar
org.apache.pig.Main
On Tue, Apr 24, 2012 at 12:09 AM, Daniel Dai <[EMAIL PROTECTED]> wrote:

> Do you have HADOOP_HOME set? Both HADOOP_CONF_DIR and PIG_CLASSPATH should
> work; can you use bin/pig -secretDebugCmd to check the hadoop command
> line?
>
> On Mon, Apr 23, 2012 at 11:32 PM, Prashant Kommireddi
> <[EMAIL PROTECTED]> wrote:
> > Thanks Dmitriy, that works. But I am wondering why the behavior is
> > different from the previous versions.
> >
> > The difference I see in bin/pig between 0.10.0 and 0.9.1 is:
> >
> >> # add HADOOP_CONF_DIR
> >> if [ "$HADOOP_CONF_DIR" != "" ]; then
> >>     CLASSPATH=${CLASSPATH}:${HADOOP_CONF_DIR}
> >> fi
> >
> > AFAIK, this should not affect it - all it's doing is adding the conf dir
> > to the classpath, which I was doing earlier through PIG_CLASSPATH in my
> > wrapper script.
> >
> > The issue here is that certain properties are not the same between the
> > client machine and the remote cluster, e.g. JAVA_HOME. Since Pig is
> > client-side, it made sense for Pig not to pick up any cluster properties
> > from "hadoop-env.sh". I am not sure what the change here is that's now
> > causing them to be picked up.
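To illustrate the mismatch (the JDK path below is an assumption, not taken from this thread):

```shell
# Hypothetical cluster-side conf/hadoop-env.sh entry. If a client-side Pig
# sources the cluster's hadoop-env.sh, it inherits a JAVA_HOME that may not
# exist on the client machine.
export JAVA_HOME=/usr/java/jdk1.6.0_21
[ -d "$JAVA_HOME" ] || echo "JAVA_HOME $JAVA_HOME does not exist on this client"
```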
> >
> >
> >
> > On Mon, Apr 23, 2012 at 9:14 PM, Dmitriy Ryaboy <[EMAIL PROTECTED]>
> > wrote:
> >
> >> pig.sh understands the arguments below -- try setting HADOOP_CONF_DIR?
> >>
> >> # Environment Variables
> >> #
> >> #     JAVA_HOME           The java implementation to use.
> >> #                         Overrides JAVA_HOME.
> >> #
> >> #     PIG_CLASSPATH Extra Java CLASSPATH entries.
> >> #
> >> #     HADOOP_HOME/HADOOP_PREFIX     Environment HADOOP_HOME/HADOOP_PREFIX (0.20.205)
> >> #
> >> #     HADOOP_CONF_DIR     Hadoop conf dir
> >> #
> >> #     PIG_HEAPSIZE        The maximum amount of heap to use, in MB.
> >> #                         Default is 1000.
> >> #
> >> #     PIG_OPTS            Extra Java runtime options.
> >> #
> >> #     PIG_CONF_DIR    Alternate conf dir. Default is ${PIG_HOME}/conf.
> >> #
> >> #     HBASE_CONF_DIR - Optionally, the HBase configuration to run
> >> #                      against when using HBaseStorage
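Putting that suggestion to use might look like this (paths and the heap value are assumptions for illustration):

```shell
export HADOOP_CONF_DIR=$HOME/hadoop-conf/dev  # point Pig at the cluster's configs
export PIG_HEAPSIZE=1000                      # MB; matches the documented default
echo "conf=$HADOOP_CONF_DIR heap=${PIG_HEAPSIZE}MB"
# bin/pig -secretDebugCmd   # prints the java command line Pig would run
```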
> >>
> >>
> >>
> >> On Mon, Apr 23, 2012 at 8:45 PM, Prashant Kommireddi
> >> <[EMAIL PROTECTED]> wrote:
> >> > I have a wrapper script to switch between Pig versions and clusters.
> >> >
> >> > export PIG_HOME=$HOME/tools/Linux/hadoop/pig-$PIG_VERSION
> >> > export JAVA_HOME=$HOME/tools/Linux/jdk/jdk$JAVA_VERSION/
> >> > export PIG_CLASSPATH=$HOME/apps/gridforce/main/hadoop/conf/$HADOOP_CLUSTER
> >> >
> >> > HADOOP_CLUSTER contains the hadoop configs (endpoints) for the
> >> > cluster I want to point to.
> >> >
> >> > And then I do this to start pig.
> >> >