Pig >> mail # user >> Run a job async


Prashant Kommireddi 2013-01-24, 00:48
Jonathan Coveney 2013-01-24, 01:44
Prashant Kommireddi 2013-01-24, 02:04
Jonathan Coveney 2013-01-24, 05:09
Prashant Kommireddi 2013-01-24, 05:22
Alan Gates 2013-01-24, 16:37
Prashant Kommireddi 2013-01-24, 17:42
Alan Gates 2013-01-24, 17:46
Prashant Kommireddi 2013-02-06, 00:30
Jonathan Coveney 2013-01-24, 06:56
Praveen M 2013-01-24, 15:02
Ramakrishna Nalam 2013-01-25, 03:57
Jonathan Coveney 2013-01-25, 04:39
Ramakrishna Nalam 2013-01-25, 07:18
Cheolsoo Park 2013-01-25, 17:08
Jonathan Coveney 2013-01-25, 17:37
Rohini Palaniswamy 2013-01-26, 00:23
Re: Run a job async
Thank you for the suggestions. I will file a JIRA and add our discussion
there.
On Fri, Jan 25, 2013 at 4:23 PM, Rohini Palaniswamy <[EMAIL PROTECTED]> wrote:

> Jon,
>   Those are good areas to check. A few things I have seen in those areas:
>
>  1) JythonScriptEngine - PythonInterpreter is static, so it is not suitable
> for multiple runs if the script names are the same (I hit this issue in the
> PIG-2433 unit tests).
>  2) QueryParserDriver - There is a static cache that maps macro names to
> macro files, so identical macro names with different file locations will
> cause problems.
>  3) FileLocalizer.relativeRoot - No issues with a single cluster. It just
> needs to be reinitialized if multiple clusters are supported.
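
Item 2 above can be illustrated with a minimal sketch. This is not Pig's QueryParserDriver code: the class name, macro name, and file paths are hypothetical, and the "first registration wins" behavior is an assumption chosen to show why a static name-keyed cache causes trouble when two scripts in one JVM reuse a macro name.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical stand-in for a parser with a static name-to-file cache. */
final class MacroCacheSketch {
    // Static: shared by every script parsed in this JVM.
    private static final Map<String, String> MACRO_FILES = new ConcurrentHashMap<>();

    /** Returns the file the macro resolves to; the first file registered for a name wins. */
    static String resolve(String macroName, String filePath) {
        String previous = MACRO_FILES.putIfAbsent(macroName, filePath);
        return previous != null ? previous : filePath;
    }

    public static void main(String[] args) {
        String first = resolve("dedupe", "/scripts/a/macros.pig");
        String second = resolve("dedupe", "/scripts/b/macros.pig");
        System.out.println(first);
        // The second script asked for its own file but gets the first script's file.
        System.out.println(second);
    }
}
```

Keying such a cache on the file location as well as the name, or holding it per instance rather than statically, would avoid the collision in this sketch.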
>
> Regards,
> Rohini
>
>
> On Fri, Jan 25, 2013 at 9:37 AM, Jonathan Coveney <[EMAIL PROTECTED]> wrote:
>
> > user to bcc, +dev
> >
> > Cheolsoo,
> >
> > Can you make a JIRA for this? I can imagine a slightly heavier test suite,
> > but I like where you started. If it's not far off, then I think it'll be a
> > win to make it thread-safe. But we need to make sure to test the most
> > advanced features: UDFs (especially the same name but a different UDF in
> > different invocations), scripting UDFs (same thing), and so on.
> >
> >
> > 2013/1/25 Cheolsoo Park <[EMAIL PROTECTED]>
> >
> > > >> if you have multiple threads that run a query via PigServer, there is a
> > > >> great chance of the internals clashing because of the use of static
> > > >> variables within Pig.
> > >
> > > Recently, I spent some time on this, and what I found is that the Pig
> > > front-end is quite thread-safe. Here is how I tested it:
> > >
> > > 1) Wrote a PigUnit test that runs in MR mode.
> > > 2) Executed test cases concurrently in 4 threads using a JUnit extension
> > > called tempus-fugit:
> > > http://tempusfugitlibrary.org/documentation/junit/parallel/
> > >
> > > After fixing PIG-3096, I was able to successfully run Pig queries in
> > > parallel. It's important to note that only the front-end needs to be
> > > thread-safe since that's what is executed in parallel.
> > >
> > > I arbitrarily selected queries from the e2e test cases, so they are
> > > probably not complex enough to mimic real-world examples. Nevertheless, my
> > > test program ran without a problem for a few days. I couldn't continue my
> > > experiment because I was pulled onto something else. However, I think
> > > that making the front-end thread-safe is an achievable goal.
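
The concurrent test setup described above can be approximated without any test-framework extension. The sketch below swaps in plain java.util.concurrent rather than tempus-fugit, and `runQuery` is a hypothetical placeholder for whatever submits a script through PigServer; it makes no Pig calls itself.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class ParallelRunSketch {
    /** Runs every task on a fixed pool of `threads` workers and returns how many threw. */
    static int countFailures(List<Callable<Void>> tasks, int threads) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            int failures = 0;
            // invokeAll blocks until every task has completed or failed.
            for (Future<Void> result : pool.invokeAll(tasks)) {
                try {
                    result.get();
                } catch (ExecutionException e) {
                    failures++; // the task threw; e.getCause() holds the original error
                }
            }
            return failures;
        } finally {
            pool.shutdown();
        }
    }

    /** Hypothetical placeholder: a real test would run one query here. */
    static void runQuery(int id) {
        // intentionally empty in this sketch
    }

    public static void main(String[] args) throws InterruptedException {
        List<Callable<Void>> tasks = new ArrayList<>();
        for (int i = 0; i < 4; i++) {
            final int id = i;
            tasks.add(() -> { runQuery(id); return null; });
        }
        System.out.println("failures: " + countFailures(tasks, 4));
    }
}
```

Counting failures instead of stopping at the first one keeps a long-running soak test informative when only some of the concurrent runs clash.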
> > >
> > > Thanks,
> > > Cheolsoo
> > >
> > >
> > >
> > > On Thu, Jan 24, 2013 at 11:18 PM, Ramakrishna Nalam <[EMAIL PROTECTED]> wrote:
> > >
> > > > That clarifies it for me, thanks a lot.
> > > >
> > > > Regards,
> > > > Rama.
> > > >
> > > >
> > > > On Fri, Jan 25, 2013 at 10:09 AM, Jonathan Coveney <[EMAIL PROTECTED]> wrote:
> > > >
> > > > > Well, when I say that Pig is not multi-threaded, what I mean is that if
> > > > > you have multiple threads that run a query via PigServer, there is a
> > > > > great chance of the internals clashing because of the use of static
> > > > > variables within Pig. Pig itself, when running a single query, is
> > > > > multi-threaded. It's just not "multi-threaded" in the sense that multiple
> > > > > instances can safely be run in the same JVM.
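
The kind of clash described above can be shown in a few lines. This is a generic sketch, not Pig code: the class and field names are hypothetical. It contrasts a static field, which all threads in the JVM share, with a ThreadLocal, which gives each thread its own value.

```java
final class SharedStateSketch {
    // One slot for the whole JVM: every thread reads and writes the same value.
    static volatile String staticScript;
    // One slot per thread: each thread sees only what it set itself.
    static final ThreadLocal<String> localScript = new ThreadLocal<>();

    /** Sets both slots, lets a second thread overwrite them, and returns what the caller still sees. */
    static String[] runSecondQuery(String mine, String theirs) throws InterruptedException {
        staticScript = mine;
        localScript.set(mine);
        Thread other = new Thread(() -> {
            staticScript = theirs;   // overwrites the caller's value
            localScript.set(theirs); // only affects the other thread's own slot
        });
        other.start();
        other.join(); // after join(), the other thread's writes are visible here
        return new String[] { staticScript, localScript.get() };
    }

    public static void main(String[] args) throws InterruptedException {
        String[] seen = runSecondQuery("a.pig", "b.pig");
        System.out.println("static field: " + seen[0]); // b.pig
        System.out.println("ThreadLocal:  " + seen[1]); // a.pig
    }
}
```

The static field ends up holding the second thread's value, while the ThreadLocal still holds the caller's. Any state kept in plain static fields behaves like the first case when two queries share a JVM.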
> > > > >
> > > > >
> > > > > 2013/1/24 Ramakrishna Nalam <[EMAIL PROTECTED]>
> > > > >
> > > > > > Hi Jonathan,
> > > > > >
> > > > > > Pardon me if it's a naive question, but it's interesting that you say
> > > > > > Pig is not multithreaded.
> > > > > > We're using Pig 0.10.0, and looking at the code, it seems to do the
> > > > > > right things to handle multi-threaded requests (ThreadLocal for
> > > > > > ScriptState, for example).
> > > > > >
> > > > > > It would be great if you could point out the kind of issues there
> > > > > > could be.
> > > > > >
> > > > > >
> > > > > > Regards,
> > > > > > Rama.
> > > > > >
> > > > > >
> > > > > >
> > > > > > On Thu, Jan 24, 2013 at 8:32 PM, Praveen M <
Bill Graham 2013-01-24, 01:35