Pig, mail # user - Pig UT last nearly 8 hours and TestEvalPipeline2 lasts for 37 minutes


Re: Re: Re: Pig UT last nearly 8 hours and TestEvalPipeline2 lasts for 37 minutes
Cheolsoo Park 2012-11-20, 17:39
Hi,

I actually tried to run the entire unit test suite in multiple threads,
using this JUnit extension:
http://tempusfugitlibrary.org/documentation/junit/parallel/

This library is nice because all I had to do was add @RunWith to each
test suite, and it seems to work.

However, I ran into several problems as follows:

1) Pig is not entirely thread-safe as far as I can tell. Yes, some work
was done in the past to make Pig thread-safe, but that's limited to
the front-end. For example,
https://issues.apache.org/jira/browse/PIG-1874

But to run unit tests in multiple threads, we need to ensure that both
the front-end and the back-end are thread-safe, and apparently they are
not. For example, here is a reported race condition in local mode:
http://search-hadoop.com/m/2OdLNRMwXa2/Intermittent+NullPointerException&subj=Intermittent+NullPointerException

2) Many unit test cases are not written to run in parallel. For example,
each test case in TestLoad changes the cwd of MiniCluster. There are also
many other test cases that read/write files in the same location.
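
One way to avoid the shared-location collisions described in 2) is to give
each test its own scratch directory. The following is only a minimal sketch
using the standard java.nio.file API; TestScratchDir and its methods are
hypothetical names, not part of Pig or its test utilities:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper (not part of Pig): one private scratch directory per test.
final class TestScratchDir {
    private TestScratchDir() {}

    // Creates a fresh, uniquely named directory under the JVM temp dir.
    static Path create(String testName) {
        try {
            return Files.createTempDirectory("pig-ut-" + testName + "-");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Writes a small input file inside the given scratch directory.
    static Path writeInput(Path dir, String fileName, String contents) {
        try {
            return Files.write(dir.resolve(fileName),
                    contents.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads a file back as a UTF-8 string.
    static String read(Path file) {
        try {
            return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Two tests that both write "input.txt" then end up with different paths and
cannot overwrite each other. This does not help with process-wide state such
as the cwd change in TestLoad, which would still need to be addressed
separately.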

3) Parallelizing test cases doesn't always reduce the execution time. In
our test suites, the distribution of test cases is not uniform. Some test
suites contain many test cases while some have only one. What I found is
that test suites with many test cases actually run slower in parallel than
sequentially, due to context switching overhead. Unfortunately, tempus-fugit
doesn't provide fine-grained control over the number of threads. It blindly
runs every test case in a separate thread.
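
For comparison, the kind of thread-count control mentioned in 3) can be
sketched with a fixed-size pool from java.util.concurrent. This is not
tempus-fugit's API and not a JUnit runner; BoundedRunner is a hypothetical
name, and the tasks stand in for test cases:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical runner (not tempus-fugit): caps concurrency at a fixed thread count.
final class BoundedRunner {
    private BoundedRunner() {}

    // Runs all tasks on at most maxThreads workers; results keep submission order.
    static <T> List<T> runAll(List<Callable<T>> tasks, int maxThreads) {
        ExecutorService pool = Executors.newFixedThreadPool(maxThreads);
        try {
            // invokeAll blocks until every task has completed.
            List<Future<T>> futures = pool.invokeAll(tasks);
            List<T> results = new ArrayList<>();
            for (Future<T> future : futures) {
                results.add(future.get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while running tasks", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("task failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

With maxThreads set close to the number of cores, a suite with many test
cases queues its work instead of starting one thread per test case.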

Hope this is useful.

Thanks,
Cheolsoo

On Mon, Nov 19, 2012 at 10:57 PM, lulynn_2008 <[EMAIL PROTECTED]> wrote:

> Maybe we can run some UTs in parallel.
>
> At 2012-11-15 03:12:27,"Johnny Zhang" <[EMAIL PROTECTED]> wrote:
> >Hi, lulynn_2008:
> >I am not aware of how to shorten the time.
> >
> >Johnny
> >
> >On Tue, Nov 13, 2012 at 7:27 PM, lulynn_2008 <[EMAIL PROTECTED]> wrote:
> >
> >> Thanks.
> >> Then my environment is normal.
> >> Is there any way to shorten the time? I think maybe we can find a way to
> >> shorten the time.
> >>
> >> At 2012-11-14 10:37:55,"Johnny Zhang" <[EMAIL PROTECTED]> wrote:
> >> >Hi, lulynn:
> >> >Yes, the whole Pig unit test suite runs about 8 hours.
> >> >TestEvalPipeline runs about 26 mins and TestEvalPipelineLocal runs
> >> >about 3 mins.
> >> >
> >> >Hope it is helpful,
> >> >Johnny
> >> >
> >> >
> >> >
> >> >On Tue, Nov 13, 2012 at 6:28 PM, lulynn_2008 <[EMAIL PROTECTED]> wrote:
> >> >
> >> >>  Hi all,
> >> >>
> >> >> The whole Pig UT lasts for nearly 8 hours, and TestEvalPipeline2
> >> >> lasts for 37 minutes.
> >> >>
> >> >> My questions are:
> >> >> How long does the Pig UT normally take?
> >> >> Do we have Jenkins for the Pig UT? If yes, please share the link. Thanks
> >> >>
> >> >> Thanks
> >> >>
> >> >>
> >>
>