Pig >> mail # user >> Job setup for a pig run takes ages


RE: Job setup for a pig run takes ages
I took some snapshots; the results are attached.

I am using Pig 0.8.1:
[dli@hmaster run]$ pig -version
Apache Pig version 0.8.1-cdh3u2 (rexported)
compiled Oct 13 2011, 22:35:57

and the default Pig loader, PigStorage:

A = load 'a.csv' USING PigStorage('|') AS ( ...);

Thanks.
Dan

-----Original Message-----
From: Dmitriy Ryaboy [mailto:[EMAIL PROTECTED]]
Sent: Monday, June 18, 2012 4:26 PM
To: [EMAIL PROTECTED]
Subject: Re: Job setup for a pig run takes ages

Can you do a few successive ones?
Also, please let us know which version of pig you are using, and which loaders.

D
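
One way to collect the successive dumps asked for above is a small shell loop around jstack. This is only a sketch and not something posted in the thread: the function name take_snapshots, the JSTACK override variable, the default count and interval, and the output file names are all assumptions made for illustration. The only part taken from the thread is the plain `jstack <pid>` invocation.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): take several thread dumps of
# one JVM so that successive snapshots can be compared.
#
# Usage: take_snapshots PID [COUNT] [INTERVAL_SECONDS] [OUTPUT_DIR]
# Set JSTACK to use a dump command other than the "jstack" found on PATH.
take_snapshots() {
    pid=$1
    count=${2:-5}        # assumed default: 5 snapshots
    interval=${3:-10}    # assumed default: 10 seconds between snapshots
    outdir=${4:-.}
    dump_cmd=${JSTACK:-jstack}

    i=1
    while [ "$i" -le "$count" ]; do
        # One file per snapshot, e.g. jstack.15640.1.txt
        "$dump_cmd" "$pid" > "$outdir/jstack.$pid.$i.txt" || return 1
        # Pause between snapshots, but not after the last one
        if [ "$i" -lt "$count" ]; then
            sleep "$interval"
        fi
        i=$((i + 1))
    done
}

# Example: take_snapshots 15640 5 10 /tmp
```

If the "main" thread shows the same top frames in every file, the process is probably spending its time there rather than just passing through.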

On Mon, Jun 18, 2012 at 2:51 PM, Danfeng Li <[EMAIL PROTECTED]> wrote:
> This is the jstack output taken during the setup time; I'm not exactly sure how to interpret it.
>
> Thanks.
> Dan
>
> [dli@hmaster run]$ jstack 15640
> 2012-06-18 17:32:47
> Full thread dump Java HotSpot(TM) 64-Bit Server VM (17.0-b17 mixed mode):
>
> "Attach Listener" daemon prio=10 tid=0x0000000055dcb800 nid=0x431d waiting on condition [0x0000000000000000]
>   java.lang.Thread.State: RUNNABLE
>
> "Low Memory Detector" daemon prio=10 tid=0x0000000055105000 nid=0x3d3b runnable [0x0000000000000000]
>   java.lang.Thread.State: RUNNABLE
>
> "CompilerThread1" daemon prio=10 tid=0x0000000055103000 nid=0x3d3a waiting on condition [0x0000000000000000]
>   java.lang.Thread.State: RUNNABLE
>
> "CompilerThread0" daemon prio=10 tid=0x0000000055100000 nid=0x3d39 waiting on condition [0x0000000000000000]
>   java.lang.Thread.State: RUNNABLE
>
> "Signal Dispatcher" daemon prio=10 tid=0x00000000550fe000 nid=0x3d38 runnable [0x0000000000000000]
>   java.lang.Thread.State: RUNNABLE
>
> "Finalizer" daemon prio=10 tid=0x00000000550de800 nid=0x3d37 in Object.wait() [0x0000000041d7a000]
>   java.lang.Thread.State: WAITING (on object monitor)
>        at java.lang.Object.wait(Native Method)
>        - waiting on <0x00002aaab48a3cf8> (a java.lang.ref.ReferenceQueue$Lock)
>        at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:118)
>        - locked <0x00002aaab48a3cf8> (a java.lang.ref.ReferenceQueue$Lock)
>        at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:134)
>        at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:159)
>
> "Reference Handler" daemon prio=10 tid=0x00000000550dc800 nid=0x3d36 in Object.wait() [0x0000000041093000]
>   java.lang.Thread.State: WAITING (on object monitor)
>        at java.lang.Object.wait(Native Method)
>        - waiting on <0x00002aaab48a3cb0> (a java.lang.ref.Reference$Lock)
>        at java.lang.Object.wait(Object.java:485)
>        at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:116)
>        - locked <0x00002aaab48a3cb0> (a java.lang.ref.Reference$Lock)
>
> "main" prio=10 tid=0x0000000055065800 nid=0x3d25 runnable [0x0000000041653000]
>   java.lang.Thread.State: RUNNABLE
>        at org.apache.pig.newplan.logical.expression.ProjectExpression.getFieldSchema(ProjectExpression.java:164)
>        at org.apache.pig.newplan.logical.relational.LOInnerLoad.getSchema(LOInnerLoad.java:59)
>        at org.apache.pig.newplan.logical.optimizer.SchemaResetter.visit(SchemaResetter.java:114)
>        at org.apache.pig.newplan.logical.relational.LOInnerLoad.accept(LOInnerLoad.java:109)
>        at org.apache.pig.newplan.DependencyOrderWalker.walk(DependencyOrderWalker.java:75)
>        at org.apache.pig.newplan.logical.optimizer.SchemaResetter.visit(SchemaResetter.java:94)
>        at org.apache.pig.newplan.logical.relational.LOForEach.accept(LOForEach.java:71)
>        at org.apache.pig.newplan.DependencyOrderWalker.walk(DependencyOrderWalker.java:75)
>        at org.apache.pig.newplan.PlanVisitor.visit(PlanVisitor.java:50)
>        at org.apache.pig.newplan.logical.optimizer.SchemaPatcher.transformed(SchemaPatcher.java:43)
>        at org.apache.pig.newplan.optimizer.PlanOptimizer.optimize(PlanOptimizer.java:113)
>        at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.compile