Pig >> mail # user >> Magic numbers in my pig scripts


Re: Magic numbers in my pig scripts
Support for functions, planned as part of the Turing-complete Pig effort (still in early design stages), should help:
http://wiki.apache.org/pig/TuringCompletePig

-Thejas
On 9/29/10 3:32 PM, "Eric Wadsworth" <[EMAIL PROTECTED]> wrote:

Piggers,

Parameter substitution isn't really what I'm needing. After some
discussion with my co-workers, it looks like the best feature would
really be sort of a pre-processor. Basically, insert a line in your pig
script that would "include" another pig script, right there. Then that
other pig script could contain defines, code, whatever. This would allow
us to build a hierarchy of scripts, where we could tweak some defines at
the top level, and the results would be consumed by the lower levels.

--- Eric Wadsworth
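
The "include" idea above can be approximated outside Pig by a small text pre-processor that runs before the script is handed to Pig. The following is only a rough sketch under stated assumptions: the `--#include 'file'` directive is hypothetical (it is not Pig syntax), the names `expand_includes`, `FILES`, `constants.pig` and `report.pig` are made up for illustration, and the two scripts live in an in-memory dict so the example runs without touching disk. The included file uses Pig's existing `%default` parameter declaration to hold the constant.

```python
import re
from typing import Callable, FrozenSet

# Hypothetical directive (NOT Pig syntax): a line of the form  --#include 'path'
# Because it starts with "--", Pig itself would treat it as a comment.
INCLUDE_RE = re.compile(r"^\s*--#include\s+'([^']+)'\s*$")


def expand_includes(path: str,
                    read: Callable[[str], str],
                    _active: FrozenSet[str] = frozenset()) -> str:
    """Return the text of `path` with each include line replaced by the
    recursively expanded text of the file it names."""
    if path in _active:
        # A file that (directly or indirectly) includes itself would recurse forever.
        raise ValueError(f"circular include: {path}")
    out = []
    for line in read(path).splitlines():
        match = INCLUDE_RE.match(line)
        if match:
            out.append(expand_includes(match.group(1), read, _active | {path}))
        else:
            out.append(line)
    return "\n".join(out)


# Two in-memory "files" standing in for scripts on disk (illustrative names).
FILES = {
    "constants.pig": "%default RECORD_TYPE_ALPHA 23",
    "report.pig": (
        "--#include 'constants.pig'\n"
        "filtered_stuff = FILTER stuff BY record_type == $RECORD_TYPE_ALPHA;"
    ),
}

expanded = expand_includes("report.pig", FILES.__getitem__)
print(expanded)
```

The expanded text could then be written to a temporary file and run with Pig, leaving Pig's own parameter substitution to resolve `$RECORD_TYPE_ALPHA`. Passing the reader in as a function keeps the sketch independent of where the scripts are stored; for real files, a function that opens and reads the path would take the place of `FILES.__getitem__`.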

On 09/29/2010 11:15 AM, Aniket Mokashi wrote:
> http://wiki.apache.org/pig/ParameterSubstitution
> http://hadoop.apache.org/pig/docs/r0.3.0/piglatin.html
>
> Also, in Pig 0.8, RECORD_TYPE_ALPHA can take its value at runtime from
> an alias (such as filtered_stuff_threshold).
> https://issues.apache.org/jira/browse/PIG-1434
>
> Thanks,
> Aniket
>
> -----Original Message-----
> From: Saurav Datta [mailto:[EMAIL PROTECTED]]
> Sent: Wednesday, September 29, 2010 1:06 PM
> To: [EMAIL PROTECTED]
> Subject: Re: Magic numbers in my pig scripts
>
> Hi Eric,
>
> As I understand it, you would like to define the value of the filter at
> run time, and this value would be taken from a file.
> Am I correct?
>
> Regards,
> Saurav
>
> On Sep 29, 2010, at 10:00 AM, Eric Wadsworth wrote:
>
>
>> Hi folks!
>>
>> I'm brand new to this list, so apologies if this is an inappropriate
>> newbie question, or is otherwise incorrect, but here goes.
>>
>> I'm working with a bunch of pig scripts, and we're adding new ones
>> almost daily. They are getting more and more complex. The problem is
>> exacerbated by the proliferation of magic numbers throughout them.
>> As a software engineer, these are driving me nuts! The code is quite
>> brittle. There seems to be no way to centralize logic or even values.
>>
>> For a simple example:
>> filtered_stuff = FILTER stuff by record_type == 23;
>>
>> I'd prefer:
>> filtered_stuff = FILTER stuff by record_type == RECORD_TYPE_ALPHA;
>>
>> Where RECORD_TYPE_ALPHA is defined in some other file that the pig
>> script consumes.
>>
>> Sounds rather like the old C-style header files would be in order...
>>
>> Am I missing something obvious here? How do you guys handle this
>> problem? (We're using Pig 0.6 and are just starting to transition to
>> Pig 0.7.)
>>
>> Thanks! --- Eric Wadsworth
>>
>
>
>
