Pig >> mail # user >> Pig 0.10.0 slow startup


Prashant Kommireddi 2012-08-07, 22:44
Jonathan Coveney 2012-08-08, 05:07
Chun Yang 2012-08-08, 19:01
Jonathan Coveney 2012-08-08, 19:22
Chun Yang 2012-08-08, 22:04
Jonathan Coveney 2012-08-09, 00:38
Chun Yang 2012-08-09, 00:51
Re: Pig 0.10.0 slow startup
Can you do me a favor and run the exact same stuff with pig11? Just to
isolate whether this is an issue that has already been fixed. I will also
try to run this on pig10, to see if I can reproduce the same issue.
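
To make the runs easier to compare across versions, the timing lines quoted below can be turned into ratios. This is only a rough sketch: it assumes zsh-style `time` output rejoined onto one line per run, and the names `parse_timings`, `to_seconds` and `user_ratio` are made up for illustration. A pig11 run could be appended to `RAW` in the same format.

```python
import re

# Timing lines copied from the quoted message, rejoined onto one line each.
RAW = """\
pig10 -b -e 'explain -script students-a.pig'  35.35s user 8.52s system 63% cpu 1:08.77 total
pig10 -b -e 'explain -script students-b.pig'  5.32s user 0.48s system 130% cpu 4.460 total
pig9 -b -e 'explain -script students-a.pig'  4.93s user 0.51s system 131% cpu 4.153 total
pig9 -b -e 'explain -script students-b.pig'  3.86s user 0.41s system 131% cpu 3.254 total
"""

# Assumed layout: <cmd> ... -script <file>'  <user>s user <sys>s system <n>% cpu <wall> total
LINE = re.compile(
    r"^(?P<cmd>\S+) .*?-script (?P<script>[\w.-]+)'\s+"
    r"(?P<user>[\d.]+)s user (?P<sys>[\d.]+)s system \d+% cpu (?P<total>[\d:.]+) total$"
)


def to_seconds(clock):
    """Convert a wall-clock string such as '1:08.77' or '4.460' to seconds."""
    seconds = 0.0
    for part in clock.split(":"):
        seconds = seconds * 60 + float(part)
    return seconds


def parse_timings(raw):
    """Return {(cmd, script): {'user': s, 'sys': s, 'total': s}} for matching lines."""
    timings = {}
    for line in raw.splitlines():
        m = LINE.match(line.strip())
        if m:
            timings[(m["cmd"], m["script"])] = {
                "user": float(m["user"]),
                "sys": float(m["sys"]),
                "total": to_seconds(m["total"]),
            }
    return timings


def user_ratio(timings, script, new="pig10", old="pig9"):
    """User-CPU time of the newer version divided by that of the older one."""
    return timings[(new, script)]["user"] / timings[(old, script)]["user"]


timings = parse_timings(RAW)
for script in ("students-a.pig", "students-b.pig"):
    print(f"{script}: pig10/pig9 user-time ratio = {user_ratio(timings, script):.1f}x")
```

On the numbers quoted below this gives roughly 7.2x for students-a.pig and 1.4x for students-b.pig. User time is used rather than wall-clock time because the first pig10 run's 1:08.77 total looks like a cold start.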

2012/8/8 Chun Yang <[EMAIL PROTECTED]>

> Thanks Jonathan,
>
> Here are some numbers that I'm getting from Pig 0.10 and Pig 0.9.1:
>
> pig10 -b -e 'explain -script students-a.pig'  35.35s user 8.52s system 63% cpu 1:08.77 total
>
> pig10 -b -e 'explain -script students-b.pig'  5.32s user 0.48s system 130% cpu 4.460 total
>
> pig9 -b -e 'explain -script students-a.pig'  4.93s user 0.51s system 131% cpu 4.153 total
>
> pig9 -b -e 'explain -script students-b.pig'  3.86s user 0.41s system 131% cpu 3.254 total
>
> Seems like the first run is always slower, but subsequent runs are about
> the same:
>
> pig10 -b -e 'explain -script students-a.pig'  35.17s user 8.20s system 123% cpu 35.017 total
>
> pig10 -b -e 'explain -script students-a.pig'  35.41s user 8.55s system 122% cpu 35.803 total
>
> A little more than 1.5s slowdown :)
>
> Thanks,
> Chun
>
> On 8/8/12 5:38 PM, "Jonathan Coveney" <[EMAIL PROTECTED]> wrote:
>
> > Thanks for putting that together, Chun.
> >
> > So, it looks like there are ~400 instantiations of the class, and the
> > time from the first instantiation to the last one is about 1.5s. Is
> > that on the order of the slowdown you're experiencing?
> >
> > (note: I'm testing with Pig 11...if your slowdown is much higher than
> > that, I'll test on Pig 10)
> >
> > Either way, it seems like the slowdown is directly attributable to UDF
> > invocations. Have you seen slowdowns much larger than this?
> >
> > 2012/8/8 Chun Yang <[EMAIL PROTECTED]>
> >
> >> Hi Jonathan,
> >>
> >> Here is a more self-contained example than what I had before:
> >> http://ews.illinois.edu/~yang43/shared/students.tar.gz
> >>
> >> I wrote a trivial GFV class, but the slowdown still exists.
> >> students-a.pig starts up noticeably slower than students-b.pig .
> >>
> >> Thanks,
> >> Chun
> >>
> >> On 8/8/12 12:22 PM, "Jonathan Coveney" <[EMAIL PROTECTED]> wrote:
> >>
> >>> Thanks for this info. Can you go ahead and paste the whole GFV class?
> >>>
> >>> Thanks
> >>>
> >>> 2012/8/8 Chun Yang <[EMAIL PROTECTED]>
> >>>
> >>>> Thanks Jonathan,
> >>>>
> >>>> I've tried to produce an example script which exhibits the slowdown
> >>>> and posted it on Pastebin: http://pastebin.com/kTSsDUr3
> >>>>
> >>>> The slowdown seems to occur when we are using a lot of UDFs to parse
> >>>> our input data. Variant A in the script is noticeably slower than
> >>>> variant B in Pig 0.10, while performance is similar in Pig 0.9.1.
> >>>>
> >>>> I've pasted the exec() function of the GFV function on Pastebin as
> >>>> well: http://pastebin.com/FVnkQCJ5
> >>>>
> >>>> Please let us know if you need more details.
> >>>>
> >>>> Thanks,
> >>>> Chun
> >>>>
> >>>> On 8/7/12 10:07 PM, "Jonathan Coveney" <[EMAIL PROTECTED]> wrote:
> >>>>
> >>>>> Can you guys give a script that has the issue? My tactic would be to
> >>>>> use some sort of profiler (we have access to YourKit for open source
> >>>>> Pig contribution work) and try and isolate what is triggering GC.
> >>>>>
> >>>>> 2012/8/7 Prashant Kommireddi <[EMAIL PROTECTED]>
> >>>>>
> >>>>>> Hi All,
> >>>>>>
> >>>>>> Just wanted to follow up on Chun's question. Several of our Pig
> >>>>>> users have been experiencing slow start-ups with Pig 0.10.0, when
> >>>>>> the same script runs fine with 0.9.1. Anyone else facing similar
> >>>>>> issues?
> >>>>>>
> >>>>>> Thanks,
> >>>>>> Prashant
> >>>>>>
> >>>>>> Hi all,
> >>>>>>
> >>>>>> I'm trying to move from Pig 0.9.1 to Pig 0.10.0. When I try to run
> >>>>>> the same script using the two Pig versions, 0.9.1 starts off fast
> >>>>>> and almost immediately submits the job to the cluster. On the other
> >>>>>> hand, Pig 0.10.0 takes forever to submit the job. When I use the
> >>>>>> java option

Chun Yang 2012-08-09, 22:32
Prashant Kommireddi 2012-08-10, 20:15
Dmitriy Ryaboy 2012-08-13, 23:44
Chun Yang 2012-07-26, 22:32