

Prashant Kommireddi 2012-08-07, 22:44
Jonathan Coveney 2012-08-08, 05:07
Chun Yang 2012-08-08, 19:01
Jonathan Coveney 2012-08-08, 19:22
Chun Yang 2012-08-08, 22:04
Jonathan Coveney 2012-08-09, 00:38
Chun Yang 2012-08-09, 00:51
Re: Pig 0.10.0 slow startup
Can you do me a favor and run the exact same stuff with pig11? Just to
isolate whether this is an issue that has since been fixed. I will also try
and run this on pig10, to see if I can see the same issue.
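
For anyone who wants to repeat the comparison across versions, here is a minimal sketch of a timing harness. It assumes that `pig9`, `pig10` and `pig11` are executables on the PATH (in the thread they may well be shell aliases, which `subprocess` cannot run), and the function names are made up for illustration. The `-b -e 'explain -script ...'` invocation is the one used in the numbers quoted below.

```python
import shutil
import subprocess
import time


def time_command(argv, runs=3):
    """Run argv `runs` times and return the wall-clock seconds of each run."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(
            argv,
            check=True,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        durations.append(time.perf_counter() - start)
    return durations


def compare(launchers, script, runs=3):
    """Time `explain -script <script>` under each launcher.

    Launcher names are assumptions about the local setup; any launcher
    that is not found on the PATH is reported as None instead of failing.
    """
    results = {}
    for launcher in launchers:
        if shutil.which(launcher) is None:
            results[launcher] = None
            continue
        argv = [launcher, "-b", "-e", f"explain -script {script}"]
        results[launcher] = time_command(argv, runs)
    return results
```

Calling `compare(["pig9", "pig10", "pig11"], "students-a.pig")` would give a list of per-run wall times for each version. Keeping every run rather than an average matters here, since the first run is reported to be slower than the following ones.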

2012/8/8 Chun Yang <[EMAIL PROTECTED]>

> Thanks Jonathan,
>
> Here are some numbers that I'm getting from Pig 0.10 and Pig 0.9.1:
>
> pig10 -b -e 'explain -script students-a.pig'  35.35s user 8.52s system 63% cpu 1:08.77 total
>
> pig10 -b -e 'explain -script students-b.pig'  5.32s user 0.48s system 130% cpu 4.460 total
>
> pig9 -b -e 'explain -script students-a.pig'  4.93s user 0.51s system 131% cpu 4.153 total
>
> pig9 -b -e 'explain -script students-b.pig'  3.86s user 0.41s system 131% cpu 3.254 total
>
> Seems like the first run is always slower, but subsequent runs are about
> the same:
>
> pig10 -b -e 'explain -script students-a.pig'  35.17s user 8.20s system 123% cpu 35.017 total
>
> pig10 -b -e 'explain -script students-a.pig'  35.41s user 8.55s system 122% cpu 35.803 total
>
> A little more than 1.5s slowdown :)
>
> Thanks,
> Chun
>
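
To put a number on the slowdown from the timings above, the lines can be parsed mechanically. This is a rough sketch that assumes the output is in the zsh `time` format shown in the message; the helper names are hypothetical, and the two sample lines are copied from the message (the warm pig10 run and the pig9 run of students-a.pig).

```python
import re

# Two timing lines copied from the message above (zsh `time` format).
TIMINGS = {
    "pig10": "35.17s user 8.20s system 123% cpu 35.017 total",
    "pig9": "4.93s user 0.51s system 131% cpu 4.153 total",
}

_PATTERN = re.compile(
    r"(?P<user>[\d.]+)s user\s+"
    r"(?P<system>[\d.]+)s system\s+"
    r"(?P<cpu>\d+)% cpu\s+"
    r"(?P<total>[\d:.]+) total"
)


def to_seconds(clock):
    """Convert '4.460', '1:08.77' or 'h:mm:ss' style values to seconds."""
    seconds = 0.0
    for part in clock.split(":"):
        seconds = seconds * 60 + float(part)
    return seconds


def parse_time_line(line):
    """Extract user, system, CPU percentage and wall time from one line."""
    match = _PATTERN.search(line)
    if match is None:
        raise ValueError(f"not a recognised timing line: {line!r}")
    return {
        "user": float(match["user"]),
        "system": float(match["system"]),
        "cpu_percent": int(match["cpu"]),
        "total": to_seconds(match["total"]),
    }


if __name__ == "__main__":
    slow = parse_time_line(TIMINGS["pig10"])["total"]
    fast = parse_time_line(TIMINGS["pig9"])["total"]
    print(f"students-a.pig: pig10 is {slow - fast:.1f}s slower ({slow / fast:.1f}x)")
```

With these two lines the wall-time difference is on the order of 30 seconds, which is far more than the ~1.5s spent between the first and last UDF instantiation.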
> On 8/8/12 5:38 PM, "Jonathan Coveney" <[EMAIL PROTECTED]> wrote:
>
> > Thanks for putting that together, Chun.
> >
> > So, it looks like there are ~400 instantiations of the class, and the time
> > from the first instantiation to the last one is about 1.5s. Is that on the
> > order of the slowdown you're experiencing?
> >
> > (note: I'm testing with Pig 11...if your slowdown is much higher than that,
> > I'll test on Pig 10)
> >
> > Either way, it seems like the slowdown is directly attributable to UDF
> > invocations. Have you seen slowdowns much larger than this?
> >
> > 2012/8/8 Chun Yang <[EMAIL PROTECTED]>
> >
> >> Hi Jonathan,
> >>
> >> Here is a more self-contained example than what I had before:
> >> http://ews.illinois.edu/~yang43/shared/students.tar.gz
> >>
> >> I wrote a trivial GFV class, but the slowdown still exists.
> >> students-a.pig starts up noticeably slower than students-b.pig.
> >>
> >> Thanks,
> >> Chun
> >>
> >> On 8/8/12 12:22 PM, "Jonathan Coveney" <[EMAIL PROTECTED]> wrote:
> >>
> >>> Thanks for this info. Can you go ahead and paste the whole GFV class?
> >>>
> >>> Thanks
> >>>
> >>> 2012/8/8 Chun Yang <[EMAIL PROTECTED]>
> >>>
> >>>> Thanks Jonathan,
> >>>>
> >>>> I've tried to produce an example script which exhibits the slowdown and
> >>>> posted it on Pastebin: http://pastebin.com/kTSsDUr3
> >>>>
> >>>> The slowdown seems to occur when we are using a lot of UDFs to parse our
> >>>> input data. Variant A in the script is noticeably slower than variant B in
> >>>> Pig 0.10 while performance is similar in Pig 0.9.1.
> >>>>
> >>>> I've pasted the exec() function of the GFV function on Pastebin as well:
> >>>> http://pastebin.com/FVnkQCJ5
> >>>>
> >>>> Please let us know if you need more details.
> >>>>
> >>>> Thanks,
> >>>> Chun
> >>>>
> >>>> On 8/7/12 10:07 PM, "Jonathan Coveney" <[EMAIL PROTECTED]> wrote:
> >>>>
> >>>>> Can you guys give a script that has the issue? My tactic would be to use
> >>>>> some sort of profiler (we have access to YourKit for open source Pig
> >>>>> contribution work) and try and isolate what is triggering GC.
> >>>>>
> >>>>> 2012/8/7 Prashant Kommireddi <[EMAIL PROTECTED]>
> >>>>>
> >>>>>> Hi All,
> >>>>>>
> >>>>>> Just wanted to follow up on Chun's question. Several of our Pig users have
> >>>>>> been experiencing slow start-ups with Pig 0.10.0, when the same script runs
> >>>>>> fine with 0.9.1. Anyone else facing similar issues?
> >>>>>>
> >>>>>> Thanks,
> >>>>>> Prashant
> >>>>>>
> >>>>>> Hi all,
> >>>>>>
> >>>>>> I'm trying to move from Pig 0.9.1 to Pig 0.10.0. When I try to run the same
> >>>>>> script using the two Pig versions, 0.9.1 starts off fast and almost
> >>>>>> immediately submits the job to the cluster. On the other hand, Pig 0.10.0
> >>>>>> takes forever to submit the job. When I use the java option

Chun Yang 2012-08-09, 22:32
Prashant Kommireddi 2012-08-10, 20:15
Dmitriy Ryaboy 2012-08-13, 23:44
Chun Yang 2012-07-26, 22:32