Pig >> mail # user >> Hard-coded inline relations


Re: Hard-coded inline relations
OK, you can build it from my Elephant-Bird (EB) branch:

https://github.com/dvryaboy/elephant-bird/tree/add_locationtuple_loader

You will want to build the elephant-bird-pig package, and the loader is

com.twitter.elephantbird.pig.load.LocationAsTuple

Here's the javadoc:

/**
 * Parses the "location" into a tuple by splitting on a delimiter, and
 * returns it.
 * Handy for turning scalars into relations. For example:
 * <pre>{@code
 * languages = load 'en,fr,jp' using LocationAsTuple(',');
 * -- languages is ('en', 'fr', 'jp')
 * language_bag = foreach languages generate flatten(TOBAG(*));
 * -- language_bag is a relation with three rows, ('en'), ('fr'), ('jp')
 * }</pre>
 */
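
For readers who do not have the Elephant-Bird branch built, the splitting step the javadoc describes can be sketched in plain Java. This is not the loader's actual source: the class name LocationSplitSketch and the method splitLocation are hypothetical, and treating the delimiter as a literal string (rather than a regular expression) is an assumption on my part.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical stand-alone sketch; it does not depend on Pig or Elephant-Bird.
public class LocationSplitSketch {

    /** Splits a "location" string on a literal delimiter and returns the fields in order. */
    public static List<String> splitLocation(String location, String delimiter) {
        // Pattern.quote makes characters such as '|' or '.' match literally (assumed behavior).
        // The -1 limit keeps trailing empty fields instead of silently dropping them.
        return Arrays.asList(location.split(Pattern.quote(delimiter), -1));
    }

    public static void main(String[] args) {
        // Mirrors the javadoc example: 'en,fr,jp' split on ','.
        System.out.println(splitLocation("en,fr,jp", ",")); // prints [en, fr, jp]
    }
}
```

In the real loader the resulting fields would be wrapped in a single Pig tuple, which is why the javadoc example then needs flatten(TOBAG(*)) to get one row per value.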
On Thu, Jan 24, 2013 at 1:03 PM, Dmitriy Ryaboy <[EMAIL PROTECTED]> wrote:
>
> I have a loader that does exactly that. Let me see about dropping it into
> Elephant-Bird.
>
>
> On Thu, Jan 24, 2013 at 8:15 AM, Alan Gates <[EMAIL PROTECTED]> wrote:
>>
>> I agree this would be useful for debugging, but I'd go about it a
>> different way.  Rather than add new syntax as you propose, it seems we
>> could easily create an inline loader, so your script would look something
>> like:
>>
>> A = load '{(Hello), (World)}' using InlineLoader();
>> dump A;
>>
>> Alan.
>>
>> On Jan 18, 2013, at 10:49 AM, Michael Malak wrote:
>>
>> > I'm new to Pig, and it looks like there is no provision to declare
>> > relations inline in a Pig script (without LOADing from an external file)?
>> >
>> > Based on
>> > http://pig.apache.org/docs/r0.7.0/piglatin_ref2.html#Constants
>> > I would have thought the following would constitute "Hello World" for
>> > Pig:
>> >
>> > A = {('Hello'),('World')};
>> > DUMP A;
>> >
>> > But I get a syntax error.  The ability to inline relations would be
>> > useful for debugging.  Is this limitation by design, or is it just not
>> > implemented yet?
>> >
>>
>