Pig >> mail # dev >> Are there any explanations of the implementation of illustrate?


Re: Are there any explanations of the implementation of illustrate?
Doug,

Would definitely love for you guys to merge in your updates.

Thejas,

Thanks for providing more information. Good to know where to look. Between
the paper and some info on where to look, I have some solid leads!

Jon

2012/7/5 Thejas Nair <[EMAIL PROTECTED]>

> The earlier implementation of illustrate used the Pig local mode execution
> engine (which was the implementation at the time the paper was published).
>
> As part of the illustrate rework in PIG-1712, Yan replaced the default Map and
> Reduce context objects with an IllustratorContext. Look for
> IllustratorContext and LocalMapReduceSimulator in
> https://issues.apache.org/jira/secure/attachment/12459267/illustrator_2.patch
> The context objects read their input from memory and write their output to memory.
>
> We can consider using this for Pig local mode as well, by replacing the
> in-memory list with something that can spill to disk.
>
> -Thejas
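
The idea Thejas describes, context objects that keep map and reduce output in memory instead of writing to the file system, can be sketched roughly as follows. This is only an illustration in Python, not Pig's Java code: `InMemoryContext` and `simulate_map_reduce` are hypothetical names and do not reflect the actual IllustratorContext or LocalMapReduceSimulator classes in the patch.

```python
from collections import defaultdict


class InMemoryContext:
    """Hypothetical stand-in for a map or reduce context.

    Records are appended to a plain list instead of being written to disk.
    """

    def __init__(self):
        self.records = []

    def write(self, key, value):
        self.records.append((key, value))


def simulate_map_reduce(inputs, mapper, reducer):
    """Run one map/shuffle/reduce pass entirely in memory (a sketch)."""
    map_ctx = InMemoryContext()
    for record in inputs:
        mapper(record, map_ctx)

    # "Shuffle": group the map output by key, still in memory.
    groups = defaultdict(list)
    for key, value in map_ctx.records:
        groups[key].append(value)

    reduce_ctx = InMemoryContext()
    # Keys are visited in sorted order, so they must be comparable.
    for key in sorted(groups):
        reducer(key, groups[key], reduce_ctx)
    return reduce_ctx.records


# Usage: a word count over two tiny input lines.
def tokenize(line, ctx):
    for word in line.split():
        ctx.write(word, 1)


def count(word, ones, ctx):
    ctx.write(word, sum(ones))


result = simulate_map_reduce(["a b a", "b c"], tokenize, count)
print(result)
```

Because every record lives in a Python list here, the sketch only works for small samples. Swapping the list for a structure that can spill to disk is the change suggested above for reusing the approach in local mode.
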
>
>
>
> On 7/3/12 6:34 PM, Jonathan Coveney wrote:
>
>> Jie, that's perfect, thanks. This doc, specifically:
>> http://i.stanford.edu/~olston/publications/sigmod09.pdf is exactly the
>> detailed explanation I was looking for.
>>
>> 2012/7/3 Jie Li <[EMAIL PROTECTED]>
>>
>>> Some documentation here: http://wiki.apache.org/pig/PigIllustrate
>>>
>>> I agree that more tests are needed for illustrate, otherwise it can be
>>> easily broken without notice.
>>>
>>> Jie
>>>
>>> On Tue, Jul 3, 2012 at 12:45 PM, Jon Coveney <[EMAIL PROTECTED]> wrote:
>>>
>>>> I was curious at a level slightly higher than "dig through the code" how
>>>> illustrate is so fast, and how it deals with joins effectively. Are there
>>>> any resources on this (or does anyone at Hortonworks want to write a tech
>>>> oriented blog post? :)
>>>
>>>>
>>>>
>>>
>>
>
>