Pig >> mail # user >> Pig job result output and schema


Re: Pig job result output and schema
If you use the alias "@", it should correctly refer to the last alias for
dump and similar commands. If it does not, please file a JIRA.
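
On the follow-up question quoted below about stripping STORE and DUMP statements from a script before handing it to registerQuery, here is a minimal sketch in plain Java. It is not a Pig API: the class name PigScriptFilter and the method stripOutputStatements are hypothetical helpers made up for illustration. It assumes statements are separated by ";" and that no semicolon appears inside a quoted string or a comment, so a real script may need a proper parser instead.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Pig: drops DUMP and STORE statements from script text.
public class PigScriptFilter {
    // Matches a statement whose first keyword is DUMP or STORE (case-insensitive).
    // DOTALL lets the match span statements that are wrapped over several lines.
    private static final Pattern OUTPUT_STATEMENT =
            Pattern.compile("^(dump|store)\\b.*", Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    /** Returns the script with every DUMP and STORE statement removed. */
    public static String stripOutputStatements(String script) {
        List<String> kept = new ArrayList<>();
        // Naive split: assumes ';' never occurs inside string literals or comments.
        for (String statement : script.split(";")) {
            String trimmed = statement.trim();
            if (trimmed.isEmpty() || OUTPUT_STATEMENT.matcher(trimmed).matches()) {
                continue; // skip blank fragments and DUMP/STORE statements
            }
            kept.add(trimmed + ";");
        }
        return String.join("\n", kept);
    }

    public static void main(String[] args) {
        String script = "a = LOAD 'input' AS (x:int);\n"
                + "b = FILTER a BY x > 0;\n"
                + "DUMP b;\n"
                + "STORE b INTO 'out';";
        System.out.println(stripOutputStatements(script));
    }
}
```

The filtered text could then be passed on in place of the raw script, in the same way req.query is passed in the snippet quoted below.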
2013/3/5 Jeff Yuan <[EMAIL PROTECTED]>

> Thanks for your suggestions, they work very well.  One follow up question:
>
> Is there a way to dynamically strip STORE and DUMP commands from a
> loaded script? Everything works well if I pass in a script
> without any dump or store keywords, but when there is a dump, I get an
> error such as "Syntax error, unexpected symbol at or near 'dump'".
>
> I'm calling:
>
> pig.setBatchOn();
> pig.registerQuery(req.query);
> pig.dumpSchema(pig.getPigContext().getLastAlias());
> Iterator<Tuple> iter = pig.openIterator(pig.getPigContext().getLastAlias());
> ...
>
> Thanks,
> Jeff
>
> On Tue, Mar 5, 2013 at 11:30 AM, Johnny Zhang <[EMAIL PROTECTED]>
> wrote:
> > Hi, Jeff:
> > Reply inline.
> >
> >
> > On Tue, Mar 5, 2013 at 11:18 AM, Jeff Yuan <[EMAIL PROTECTED]>
> wrote:
> >
> >> I have a couple of questions regarding job result and schema. The
> >> context is that I'm trying to create a custom entry point for Pig that
> >> takes a script, executes it, and always stores the last declared
> >> alias/variable in a file. Would appreciate any insights to the 2
> >> questions I have below or any advice in general.
> >>
> >> 1. I'm looking to automatically dump or store the last variable/alias
> >> that the user has set. I know PigServer.getAliasKeySet or getAliases
> >> will return a Set or Map of the aliases, but they are unordered. Is
> >> there a way to get an ordered list of aliases?
> >>
> > Have you tried PigServer.getPigContext().getLastAlias()?
> >
> >>
> >> 2. I'm interested in getting the result schema and the raw result set.
> >> Is the best way to do this just PigServer.dumpSchema(alias) to get the
> >> result schema, and PigServer.openIterator(alias) to get the resulting
> >> Tuples?
> >>
> > Yes, as far as I know, this is a good way to do it. After you get the
> > iterator, you can use the loop below to go through each tuple:
> > while(iter.hasNext()) {
> >       Tuple t = iter.next();
> > }
> >
> >>
> >> Thanks,
> >> Jeff
> >>
> >
> > Johnny
>