-
Call Pig from Java - Wenhao Xu - 2010-08-05, 03:56
Hi all,
I am new to Pig. Is there a recommended way to call Pig code from Java? Is there a Java interface that can be called directly from Java and works smoothly? It seems that each keyword (filter, group, cogroup, generate) and each data type in Pig could have a counterpart in Java, using classes, interfaces, and data types. Are such Java interfaces available to Java programmers? If not, why not?
Thanks very much for the help!
regards,
Wenhao
--
~_~
-
Re: Call Pig from Java - Harsh J - 2010-08-05, 04:01
You need to use the class PigServer.

    PigServer pigServer = new PigServer("mapreduce"); // Or "local" for local mode
    pigServer.registerQuery("A = LOAD ...");
    (...) // Your statements here.
    pigServer.store("A", "filename");

--
Harsh J
www.harshj.com
-
Re: Call Pig from Java - Harsh J - 2010-08-05, 04:02
Sorry, forgot the API link:
http://hadoop.apache.org/pig/docs/r0.7.0/api/org/apache/pig/PigServer.html
--
Harsh J
www.harshj.com
-
Re: Call Pig from Java - Wenhao Xu - 2010-08-05, 04:13
By the way, I am considering using it to speed up (parallelize) online queries over a large dataset. Is Pig suitable for this, or only for offline analysis of large data? Would it be a better choice than a distributed (parallel) database in terms of scalability and latency?
I really like Pig's programming interface, so I want to try it instead of a parallel database.
Thanks!
cheers,
W.
On Wed, Aug 4, 2010 at 9:08 PM, Wenhao Xu <[EMAIL PROTECTED]> wrote:
> Thanks!
> Can PigServer handle concurrent requests? Because the store is a
> synchronous interface, is there any asynchronous one?
>
> cheers,
> W.
--
~_~
-
Re: Call Pig from Java - Dmitriy Lyubimov - 2010-08-05, 04:19
In my personal and not very long-lived opinion, PigServer is not very useful, as it doesn't run Pig scripts directly.
I actually integrated the Grunt setup into a Spring bean and have been able to run Pig scripts that way, initialized as a resource. It also takes care of the project classpath on the Hadoop side, so no registration of jars is necessary: anything in our project classpath (as built by Maven) is automatically added to the backend classpaths. Feeding in script parameters through Spring injection is also consistent with our use of Spring, and useful.
This requires some work (a couple of days) to dig up the various Grunt parameters and PigContext parameters, but I think it pays off in the convenience of using a regular Grunt script and the ease of UDF access.
-Dmitriy
-
Re: Call Pig from Java - Dmitriy Lyubimov - 2010-08-05, 04:21
No, Pig (or any MapReduce stuff) is not really useful for real-time queries, not unless you can wait at least a couple of minutes. It would seem you need to look towards HBase, Cassandra, and the like, under the 'NoSQL' umbrella.
On Wed, Aug 4, 2010 at 9:13 PM, Wenhao Xu <[EMAIL PROTECTED]> wrote:
> btw, I am considering using it to speedup (parallel) online queries over
> large dataset. Is pig suitable for this, or just suitable for offline large
> data analysis? Will it be a better choice than distributed(parallel)
> database in terms of scalability and latency?
-
Re: Call Pig from Java - Jeff Zhang - 2010-08-05, 04:25
Currently, PigServer is not thread-safe. You can try the patches in
http://issues.apache.org/jira/browse/PIG-240
On Thu, Aug 5, 2010 at 12:08 PM, Wenhao Xu <[EMAIL PROTECTED]> wrote:
> Thanks!
> Can PigServer handle concurrent requests? Because the store is a
> synchronous interface, is there any asynchronous one?
>
> cheers,
> W.
--
Best Regards
Jeff Zhang
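On the asynchronous half of the question: one common workaround for a synchronous, non-thread-safe API is to funnel every call through a single worker thread, so the caller gets a Future back immediately while only one call ever runs at a time. The following is only a sketch under that assumption. `SerializedJobRunner` is a hypothetical name, not part of Pig, and the Callable is a stand-in for whatever registerQuery/store sequence you would run. It does not make PigServer itself thread-safe, and it does nothing about state that accumulates inside one long-lived JVM.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/**
 * Hypothetical helper (not part of Pig): runs submitted jobs strictly one
 * at a time, in submission order, on a single background thread.
 */
final class SerializedJobRunner implements AutoCloseable {
    // A single-thread executor guarantees that no two jobs overlap.
    private final ExecutorService worker = Executors.newSingleThreadExecutor();

    /** Queues the job and returns immediately; the Future completes later. */
    <T> Future<T> submit(Callable<T> job) {
        return worker.submit(job);
    }

    /** Stops accepting new jobs; jobs already queued still run. */
    @Override
    public void close() {
        worker.shutdown();
    }
}
```

A caller would wrap its Pig statements in the Callable, keep the returned Future, and call get() on it only when the result is actually needed.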
-
Re: Call Pig from Java - Vincent Barat - 2010-08-05, 06:08
No. PigServer is not reentrant at this time, AFAIK, and even if you create several PigServer objects you will run into trouble, as there is a small set of global data shared between them. It may work for a while, but it will fail at some point. The only way is to create separate processes to handle your requests.
Wenhao Xu <[EMAIL PROTECTED]> wrote:
>Thanks!
>Can PigServer handle concurrent requests? Because the store is a
>synchronous interface, is there any asynchronous one?
>
>cheers,
>W.
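The "separate processes" approach can be sketched with plain `ProcessBuilder`: each request starts its own `pig` command, so nothing is shared between runs. This is an illustration rather than a tested recipe. It assumes a `pig` launcher script is on the PATH and that the installed version accepts `-x <exectype>` and `-param name=value`; `PigProcessLauncher`, the script name, and the parameter names are all hypothetical.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: runs each Pig script in its own OS process. */
final class PigProcessLauncher {

    /**
     * Builds the command line for one run.
     * Assumption: a "pig" executable is on the PATH.
     */
    static List<String> buildCommand(String execType, String scriptPath,
                                     Map<String, String> params) {
        List<String> cmd = new ArrayList<>();
        cmd.add("pig");
        cmd.add("-x");
        cmd.add(execType);                 // e.g. "local" or "mapreduce"
        for (Map.Entry<String, String> p : params.entrySet()) {
            cmd.add("-param");
            cmd.add(p.getKey() + "=" + p.getValue());
        }
        cmd.add(scriptPath);               // script file goes last
        return cmd;
    }

    /** Starts the process, waits for it to finish, returns its exit code. */
    static int run(List<String> command) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command)
                .inheritIO()               // forward the child's stdout/stderr
                .start();
        return process.waitFor();
    }
}
```

The cost is a JVM startup per request, which is usually small next to the runtime of a MapReduce job; in exchange, any memory or threads a run leaves behind disappear when its process exits.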
-
Re: Call Pig from Java - Gerrit van Vuuren - 2010-08-05, 06:27
Yep, I can confirm that if you call it enough times within the same Java process, you will eventually run out of memory. I've tried this before: I monitored it with jconsole and saw memory gradually increasing over 50 or so iterations. Each iteration also created its own set of threads that never died, though this might be in the Hadoop client itself.
I even tried using a whole different set of classloaders to try to unload classes after each call, but this did not work either.
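The same two symptoms (heap that keeps growing, threads that never die) can also be logged from inside the process with the standard `java.lang.management` beans, which is handy when attaching jconsole is not practical. A minimal sketch: `LeakProbe` is a hypothetical name, and the idea is to call `snapshot(i)` after each job and compare the lines over many iterations. A single reading says little, because used heap moves with garbage collection.

```java
import java.lang.management.ManagementFactory;

/** Hypothetical helper: reports heap usage and live thread count of this JVM. */
final class LeakProbe {

    /** Bytes of heap currently in use (includes garbage not yet collected). */
    static long usedHeapBytes() {
        return ManagementFactory.getMemoryMXBean().getHeapMemoryUsage().getUsed();
    }

    /** Number of live threads, daemon and non-daemon. */
    static int liveThreads() {
        return ManagementFactory.getThreadMXBean().getThreadCount();
    }

    /** One log line per iteration; compare values across many iterations. */
    static String snapshot(int iteration) {
        return String.format("iteration=%d heapUsedBytes=%d liveThreads=%d",
                iteration, usedHeapBytes(), liveThreads());
    }
}
```

A thread count that climbs by the same amount on every iteration is the clearer signal of the two, since it is not affected by when the garbage collector happens to run.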
-
Re: Call Pig from Java - Dmitriy Lyubimov - 2010-08-05, 17:14
Gerrit,
what you are saying is much more serious than just not being reentrant. Are you saying that you could not run the same Grunt or PigServer instance through a bunch of scripts without an OOM eventually happening? Can you share which version of Pig that was?
So far I haven't actually seen that issue, but then I run only 4 Pig jobs a day, and the process I run has been up for perhaps a couple of weeks only.
Thanks.
-Dmitriy