Pig, mail # user - Hard-coded inline relations


Michael Malak 2013-01-18, 18:49
Alan Gates 2013-01-24, 16:15
Dmitriy Ryaboy 2013-01-24, 21:03
Re: Hard-coded inline relations
Dmitriy Ryaboy 2013-01-24, 22:20
OK, you can build it from my Elephant-Bird branch:

https://github.com/dvryaboy/elephant-bird/tree/add_locationtuple_loader

You will want to build the elephant-bird-pig package, and the loader is

com.twitter.elephantbird.pig.load.LocationAsTuple

Here's the javadoc:

/**
 * Parses the "location" into a tuple by splitting on a delimiter, and
 * returns it.
 * Handy for turning scalars into relations. For example:
 * <pre>{@code
 * languages = load 'en,fr,jp' using LocationAsTuple(',');
 * -- languages is ('en', 'fr', 'jp')
 * language_bag = foreach languages generate flatten(TOBAG(*));
 * -- language_bag is a relation with three rows, ('en'), ('fr'), ('jp')
 * }</pre>
 */
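
To make the javadoc's example concrete, here is a minimal plain-Java sketch of the two steps it describes: splitting the location string into one tuple, then flattening that tuple into one row per field. This is not the Elephant-Bird source. The class name LocationSplitSketch and the methods toTuple and flattenToRows are hypothetical names chosen for illustration, and a real Pig loader would extend Pig's LoadFunc, which is omitted here so the sketch runs without Pig on the classpath.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical stand-in for illustration; not the Elephant-Bird implementation.
class LocationSplitSketch {

    // Split the "location" string on a literal delimiter, yielding the
    // fields of a single tuple.
    static List<String> toTuple(String location, String delimiter) {
        // Pattern.quote makes the delimiter literal, so "|" or "." are safe;
        // the -1 limit keeps trailing empty fields instead of dropping them.
        return Arrays.asList(location.split(Pattern.quote(delimiter), -1));
    }

    // Mimic flatten(TOBAG(*)): emit one single-field row per tuple field.
    static List<List<String>> flattenToRows(List<String> tuple) {
        List<List<String>> rows = new ArrayList<>();
        for (String field : tuple) {
            rows.add(Collections.singletonList(field));
        }
        return rows;
    }

    public static void main(String[] args) {
        List<String> languages = toTuple("en,fr,jp", ",");
        System.out.println(languages);                // [en, fr, jp]
        System.out.println(flattenToRows(languages)); // [[en], [fr], [jp]]
    }
}
```

Quoting the delimiter is a design choice of this sketch; whether the actual loader treats its argument as a literal string or as a regular expression is not stated in the javadoc.
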
On Thu, Jan 24, 2013 at 1:03 PM, Dmitriy Ryaboy <[EMAIL PROTECTED]> wrote:
>
> I have a loader that does exactly that. Let me see about dropping it into
> Elephant-Bird.
>
>
> On Thu, Jan 24, 2013 at 8:15 AM, Alan Gates <[EMAIL PROTECTED]> wrote:
>>
>> I agree this would be useful for debugging, but I'd go about it a
>> different way.  Rather than add new syntax as you propose, it seems we
>> could easily create an inline loader, so your script would look something
>> like:
>>
>> A = load '{(Hello), (World)}' using InlineLoader();
>> dump A;
>>
>> Alan.
>>
>> On Jan 18, 2013, at 10:49 AM, Michael Malak wrote:
>>
>> > I'm new to Pig, and it looks like there is no provision to declare
>> > relations inline in a Pig script (without LOADing from an external file)?
>> >
>> > Based on
>> > http://pig.apache.org/docs/r0.7.0/piglatin_ref2.html#Constants
>> > I would have thought the following would constitute "Hello World" for
>> > Pig:
>> >
>> > A = {('Hello'),('World')};
>> > DUMP A;
>> >
>> > But I get a syntax error.  The ability to inline relations would be
>> > useful for debugging.  Is this limitation by design, or is it just not
>> > implemented yet?
>> >
>>
>
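
The inline-loader idea quoted above, where the load "location" is itself a bag literal such as '{(Hello), (World)}', could be parsed roughly as follows. This is only a sketch under stated assumptions: the thread does not show an InlineLoader implementation, the class InlineBagSketch and its parseBag method are hypothetical, every field is treated as a plain string, and nested tuples or quoted commas are not handled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; not an existing Pig class.
class InlineBagSketch {

    // Matches one parenthesized tuple with no nested parentheses, e.g. "(Hello)".
    private static final Pattern TUPLE = Pattern.compile("\\(([^()]*)\\)");

    // Parse a bag literal such as "{(Hello), (World)}" into rows of string fields.
    static List<List<String>> parseBag(String literal) {
        List<List<String>> rows = new ArrayList<>();
        Matcher m = TUPLE.matcher(literal);
        while (m.find()) {
            List<String> fields = new ArrayList<>();
            // Fields inside a tuple are comma-separated; trim surrounding spaces.
            for (String field : m.group(1).split(",", -1)) {
                fields.add(field.trim());
            }
            rows.add(fields);
        }
        return rows;
    }

    public static void main(String[] args) {
        System.out.println(parseBag("{(Hello), (World)}")); // [[Hello], [World]]
    }
}
```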