RE: Magic numbers in my pig scripts
http://wiki.apache.org/pig/ParameterSubstitution
http://hadoop.apache.org/pig/docs/r0.3.0/piglatin.html
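
A minimal sketch of how parameter substitution could apply to the FILTER from the original question. Only RECORD_TYPE_ALPHA and the value 23 come from this thread; the script name filter.pig, the parameter file constants.params, the input path, and the schema are hypothetical.

```pig
-- filter.pig (hypothetical script name)

-- Fallback value, used only if the parameter is not supplied at run time.
%default RECORD_TYPE_ALPHA 23

-- Input path and schema are assumptions for illustration.
stuff = LOAD 'stuff.txt' AS (record_type:int, payload:chararray);

-- $RECORD_TYPE_ALPHA is replaced with its value before the script is parsed.
filtered_stuff = FILTER stuff BY record_type == $RECORD_TYPE_ALPHA;

DUMP filtered_stuff;
```

Shared values could then live in one parameter file containing lines such as `RECORD_TYPE_ALPHA = 23`, passed with `pig -param_file constants.params filter.pig`. A single value can also be given directly with `pig -param RECORD_TYPE_ALPHA=23 filter.pig`.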

Also, in Pig 0.8, RECORD_TYPE_ALPHA can take its value at run time from
a single-row alias (for example, filtered_stuff_threshold).
https://issues.apache.org/jira/browse/PIG-1434
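
A rough sketch of what using a relation as a scalar might look like, assuming Pig 0.8 or later. The file names stuff.txt and thresholds.txt, the field names, and the schemas are hypothetical, and thresholds.txt is assumed to hold exactly one row.

```pig
-- Assumes Pig 0.8 or later. File names and schemas are illustrative only.
stuff = LOAD 'stuff.txt' AS (record_type:int, payload:chararray);

-- thresholds.txt is assumed to contain exactly one row, e.g. "23".
thresholds = LOAD 'thresholds.txt' AS (record_type_alpha:int);

-- A single-row relation can be referenced as a scalar via alias.field.
filtered_stuff = FILTER stuff BY record_type == thresholds.record_type_alpha;

DUMP filtered_stuff;
```

Unlike parameter substitution, the value here is read when the job runs, so it could also be the output of an earlier step in the same script.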

Thanks,
Aniket

-----Original Message-----
From: Saurav Datta [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, September 29, 2010 1:06 PM
To: [EMAIL PROTECTED]
Subject: Re: Magic numbers in my pig scripts

Hi Eric,

As I understand it, you would like to define the filter value at
run time, with the value read from a file.
Am I correct?

Regards,
Saurav

On Sep 29, 2010, at 10:00 AM, Eric Wadsworth wrote:

> Hi folks!
>
> I'm brand new to this list, so apologies if this is an inappropriate  
> newbie question, or is otherwise incorrect, but here goes.
>
> I'm working with a bunch of pig scripts, and we're adding new ones  
> almost daily. They are getting more and more complex. The problem is  
> exacerbated by the proliferation of magic numbers throughout them.  
> As a software engineer, these are driving me nuts! The code is quite  
> brittle. There seems to be no way to centralize logic or even values.
>
> For a simple example:
> filtered_stuff = FILTER stuff by record_type == 23;
>
> I'd prefer:
> filtered_stuff = FILTER stuff by record_type == RECORD_TYPE_ALPHA;
>
> Where RECORD_TYPE_ALPHA is defined in some other file that the pig  
> script consumes.
>
> Sounds rather like the old C-style header files would be in order...
>
> Am I missing something obvious here? How do you guys handle this
> problem? (We're using Pig 0.6 and are just starting to transition to
> Pig 0.7.)
>
> Thanks! --- Eric Wadsworth