Pig, mail # user - Pig job result output and schema


Re: Pig job result output and schema
Jonathan Coveney 2013-03-05, 22:03
If you use the alias "@", it should properly dump (etc.) the last alias. If
not, file a JIRA.
2013/3/5 Jeff Yuan <[EMAIL PROTECTED]>

> Thanks for your suggestions, they work very well.  One follow up question:
>
> Is there a way to dynamically strip STORE and DUMP commands from a
> loaded in script? So everything works well if I pass in a script
> without any dump or store keywords. But when there is a dump, I get an
> error such as "Syntax error, unexpected symbol at or near 'dump'".
>
> I'm calling:
>
> pig.setBatchOn();
> pig.registerQuery(req.query);
> pig.dumpSchema(pig.getPigContext().getLastAlias());
> Iterator<Tuple> iter = pig.openIterator(pig.getPigContext().getLastAlias());
> ...
>
> Thanks,
> Jeff
>
> On Tue, Mar 5, 2013 at 11:30 AM, Johnny Zhang <[EMAIL PROTECTED]>
> wrote:
> > Hi, Jeff:
> > Reply inline.
> >
> >
> > On Tue, Mar 5, 2013 at 11:18 AM, Jeff Yuan <[EMAIL PROTECTED]>
> > wrote:
> >
> >> I have a couple of questions regarding job result and schema. The
> >> context is that I'm trying to create a custom entry point for Pig that
> >> takes a script, executes it, and always stores the last declared
> >> alias/variable in a file. Would appreciate any insights to the 2
> >> questions I have below or any advice in general.
> >>
> >> 1. I'm looking to automatically dump or store the last variable/alias
> >> that the user has set. I know PigServer.getAliasKeySet or getAliases
> >> will return a Set or Map of the aliases, but they are unordered. Is
> >> there a way to get an ordered list of aliases?
> >>
> > Have you tried PigServer.getPigContext().getLastAlias()?
> >
> >>
> >> 2. I'm interested in getting the result schema and the raw result set.
> >> Is the best way to do this just PigServer.dumpSchema(alias) to get the
> >> result schema, and PigServer.openIterator(alias) to get the resulting
> >> Tuples?
> >>
> > Yes, as far as I know, this is a good way to do it. After you get the
> > iterator, you can use the loop below to go through each tuple:
> > while(iter.hasNext()) {
> >       Tuple t = iter.next();
> > }
> >
> >>
> >> Thanks,
> >> Jeff
> >>
> >
> > Johnny
>
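
On the question above about stripping STORE and DUMP statements before handing a script to registerQuery: one possible approach is to filter the script text yourself. The sketch below is only an illustration, not part of Pig. The class name PigScriptFilter and the method stripOutputStatements are hypothetical, and it assumes statements are separated by ';' with no semicolons inside quoted strings or comments. A real implementation would need a proper parser.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper (not a Pig API): drops DUMP and STORE statements from script text.
final class PigScriptFilter {
    // A statement whose first keyword is DUMP or STORE; Pig keywords are case-insensitive.
    private static final Pattern OUTPUT_STATEMENT =
        Pattern.compile("^\\s*(dump|store)\\b.*", Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    static String stripOutputStatements(String script) {
        List<String> kept = new ArrayList<>();
        // Naive assumption: every ';' ends a statement.
        for (String statement : script.split(";")) {
            String trimmed = statement.trim();
            if (trimmed.isEmpty() || OUTPUT_STATEMENT.matcher(trimmed).matches()) {
                continue; // skip blank fragments and DUMP/STORE statements
            }
            kept.add(trimmed + ";");
        }
        return String.join("\n", kept);
    }

    public static void main(String[] args) {
        String script = "a = LOAD 'input' AS (x:int);\n"
                      + "b = FILTER a BY x > 0;\n"
                      + "DUMP b;\n"
                      + "STORE b INTO 'out';";
        // Prints only the LOAD and FILTER statements.
        System.out.println(stripOutputStatements(script));
    }
}
```

The filtered text could then be passed to registerQuery in place of the raw script, with getLastAlias() used as in the snippet quoted above.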