Pig >> mail # user >> Pig job result output and schema

Re: Pig job result output and schema
Thanks for your suggestions, they work very well. One follow-up question:

Is there a way to dynamically strip STORE and DUMP commands from a
loaded script? Everything works when I pass in a script without any
DUMP or STORE keywords, but when the script contains a DUMP, I get an
error such as "Syntax error, unexpected symbol at or near 'dump'".

I'm calling:

Iterator<Tuple> iter = pig.openIterator(pig.getPigContext().getLastAlias());
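
One way the stripping asked about above might be approached is to pre-filter the script text before handing it to PigServer. The following is only a minimal sketch, not a Pig API: the class name PigScriptFilter and the method stripOutputStatements are hypothetical, and it assumes statements are separated by ';' with no semicolons inside quoted strings or comments, and that no comment precedes the DUMP or STORE keyword within a statement.

```java
import java.util.regex.Pattern;

// Hypothetical helper (not part of Pig): drops DUMP/STORE statements from script text.
public class PigScriptFilter {
    // A statement whose first keyword is DUMP or STORE, case-insensitive.
    // DOTALL lets '.' span statements that wrap over several lines.
    private static final Pattern OUTPUT_STMT = Pattern.compile(
            "^\\s*(dump|store)\\b.*", Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    /**
     * Returns the script without DUMP/STORE statements.
     * Naive assumption: ';' only ever appears as a statement terminator.
     */
    public static String stripOutputStatements(String script) {
        StringBuilder out = new StringBuilder();
        for (String stmt : script.split(";")) {
            // Skip empty fragments and any statement that starts with DUMP or STORE.
            if (stmt.trim().isEmpty() || OUTPUT_STMT.matcher(stmt).matches()) {
                continue;
            }
            out.append(stmt.trim()).append(";\n");
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String script = "a = LOAD 'in' AS (x:int);\n"
                + "b = FILTER a BY x > 1;\n"
                + "DUMP b;\n"
                + "STORE b INTO 'out';";
        // Prints only the LOAD and FILTER statements.
        System.out.print(stripOutputStatements(script));
    }
}
```

The filtered text could then be registered the same way the unfiltered script is registered today, after which getLastAlias() and openIterator() apply as before. A real Pig script can contain semicolons inside string literals or comments, so a production version would need a proper tokenizer rather than a split on ';'.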


On Tue, Mar 5, 2013 at 11:30 AM, Johnny Zhang <[EMAIL PROTECTED]> wrote:
> Hi, Jeff:
> Reply inline.
> On Tue, Mar 5, 2013 at 11:18 AM, Jeff Yuan <[EMAIL PROTECTED]> wrote:
>> I have a couple of questions regarding job result and schema. The
>> context is that I'm trying to create a custom entry point for Pig that
>> takes a script, executes it, and always stores the last declared
>> alias/variable in a file. Would appreciate any insights to the 2
>> questions I have below or any advice in general.
>> 1. I'm looking to automatically dump or store the last variable/alias
>> that the user has set. I know PigServer.getAliasKeySet or getAliases
>> will return a Set or Map of the alias. But they are unordered, is
>> there a way to get an ordered list of aliases?
> Have you tried PigServer.getPigContext().getLastAlias()?
>> 2. I'm interested in getting the result schema and the raw result set.
>> Is the best way to do this just PigServer.dumpSchema(alias) to get the
>> result schema, and PigServer.openIterator(alias) to get the resulting
>> Tuples?
> Yes, as far as I know, this is a good way to do it. After you get the
> iterator, you can use the loop below to go through each tuple:
> while(iter.hasNext()) {
>       Tuple t = iter.next();
> }
>> Thanks,
>> Jeff
> Johnny