Re: Is there anything in pig that supports external client to stream out a content of alias? a bit like Hive Thrift server...
Ashutosh Chauhan 2010-12-07, 18:16
I am not sure I understood your requirements clearly, but if you are not looking for a pure Pig Latin solution and can work through Pig's Java API, then you may want to look at PigServer:

http://pig.apache.org/docs/r0.7.0/api/org/apache/pig/PigServer.html

Something along the following lines:

    // pc is an already configured PigContext
    PigServer pig = new PigServer(pc, true);
    pig.registerQuery("A = load 'mydata';");
    pig.registerQuery("B = filter A by $0 > 10;");
    Iterator<Tuple> itr = pig.openIterator("B");
    while (itr.hasNext()) {
        // get(0) returns an Object, so compare with equals() rather than ==.
        // This assumes $0 arrives as an Integer; without a schema on the
        // load it is likely a bytearray and needs converting first.
        if (Integer.valueOf(25).equals(itr.next().get(0))) {
            // trigger further processing.
        }
    }

It's obviously not directly useful, but it conveys the general idea: the iterator is consumed in your client JVM, not inside the Hadoop cluster.

Hope it helps.
Ashutosh

On Tue, Dec 7, 2010 at 06:40, Jae Lee <[EMAIL PROTECTED]> wrote:
> Hi,
>
> In our application Hive is used as a database. i.e. a result set from a select query is consumed outside of hadoop cluster.
>
> The consumption process is not Hadoop friendly as in it is network bound not cpu/disk bound.
>
> I'm in a process of converting hive query into pig query to see if it reads better.
>
> What I'm stuck at is finding the content of a specific alias dump, from all the other stuff being logged, to be able to trigger further process.
>
> STREAM <alias> THROUGH <cmd> seems to be one way to trigger a process, it's just that it seems not suitable for the kind of process we are looking at, because the <cmd> gets run in hadoop cluster.
>
> any thought?
>
> J
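
For what it's worth, the client-side half of this idea (walk the rows of an alias and trigger something outside the cluster) can be sketched without Pig on the classpath. Everything named here is hypothetical: AliasConsumer and consume are made up for illustration, and each row is modelled as a plain List<Object> standing in for a Pig Tuple. In a real setup the iterator would presumably come from PigServer.openIterator, with each Tuple adapted to a list first.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper, not part of Pig: drains an iterator of rows on the
// client and fires a callback for every row whose first field matches.
final class AliasConsumer {

    // Returns the number of rows that triggered the callback.
    public static int consume(Iterator<List<Object>> rows,
                              Object trigger,
                              Consumer<List<Object>> onMatch) {
        int fired = 0;
        while (rows.hasNext()) {
            List<Object> row = rows.next();
            // Fields are boxed objects, so use equals() instead of ==.
            if (!row.isEmpty() && trigger.equals(row.get(0))) {
                onMatch.accept(row);
                fired++;
            }
        }
        return fired;
    }

    public static void main(String[] args) {
        // Stand-in data for the contents of alias B (assumption: $0 is an int).
        List<List<Object>> sample = Arrays.asList(
                Arrays.<Object>asList(11, "a"),
                Arrays.<Object>asList(25, "b"),
                Arrays.<Object>asList(40, "c"));

        int fired = consume(sample.iterator(), 25,
                row -> System.out.println("trigger further processing for " + row));
        System.out.println(fired + " row(s) matched");
    }
}
```

Keeping the callback separate from the loop means the network-bound consumer runs in the client process and can be swapped out without touching the Pig query.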