Re: Pig developer meeting in February
What do you mean by true predicate pushdown? We hand the full
filter expression to the loader in that method. I'd guess that is
sufficient information to push more processing down to the storage layer,
e.g. to do range queries in HBase. Pig doesn't have any more information
about filters than that to push, unless you want the full logical plan.
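
As a rough illustration of what a loader could do with a filter it is handed, here is a minimal sketch that turns comparisons on a row key into a scan range and keeps everything else as a residual filter. All names here (RangePushdownSketch, Comparison, KeyRange, toKeyRange) are hypothetical and are not Pig or HBase classes; the filter is also assumed to be a flat list of AND-ed comparisons rather than a real expression tree.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, self-contained sketch; none of these types come from Pig or HBase.
final class RangePushdownSketch {

    /** One comparison of a column against a constant, e.g. key >= "user100". */
    static final class Comparison {
        final String column;
        final String op;
        final String value;

        Comparison(String column, String op, String value) {
            this.column = column;
            this.op = op;
            this.value = value;
        }
    }

    /** Half-open key range [start, stop); a null bound means unbounded. */
    static final class KeyRange {
        final String start;
        final String stop;

        KeyRange(String start, String stop) {
            this.start = start;
            this.stop = stop;
        }
    }

    /**
     * Splits AND-ed comparisons into a key range the store could serve directly
     * and the residual comparisons that still have to be evaluated after loading.
     * Only ">=" and "<" on the key column are pushed; everything else is residual.
     */
    static KeyRange toKeyRange(List<Comparison> conjuncts, String keyColumn,
                               List<Comparison> residual) {
        String start = null;
        String stop = null;
        for (Comparison c : conjuncts) {
            boolean onKey = c.column.equals(keyColumn);
            if (onKey && c.op.equals(">=")) {
                // Keep the tightest (largest) lower bound.
                if (start == null || c.value.compareTo(start) > 0) {
                    start = c.value;
                }
            } else if (onKey && c.op.equals("<")) {
                // Keep the tightest (smallest) upper bound.
                if (stop == null || c.value.compareTo(stop) < 0) {
                    stop = c.value;
                }
            } else {
                residual.add(c);
            }
        }
        return new KeyRange(start, stop);
    }

    public static void main(String[] args) {
        List<Comparison> filter = new ArrayList<>();
        filter.add(new Comparison("key", ">=", "user100"));
        filter.add(new Comparison("key", "<", "user200"));
        filter.add(new Comparison("age", ">", "30"));

        List<Comparison> residual = new ArrayList<>();
        KeyRange range = toKeyRange(filter, "key", residual);
        System.out.println("scan [" + range.start + ", " + range.stop
                + "), residual comparisons: " + residual.size());
    }
}
```

The residual list matters: anything the store cannot evaluate still has to be applied after loading, so the loader needs a way to say which part of the filter it actually accepted.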

On Wed, Jan 26, 2011 at 18:04, Dmitriy Ryaboy <[EMAIL PROTECTED]> wrote:
> Right, we do partition filtering, but not true predicate pushdown.
> On Wed, Jan 26, 2011 at 5:59 PM, Daniel Dai <[EMAIL PROTECTED]> wrote:
>> Are you talking about LoadMetadata.setPartitionFilter?
>> PartitionFilterOptimizer will do that.
>> Daniel
>> Dmitriy Ryaboy wrote:
>>> I may be wrong, but I think predicate pushdown is designed for, but not
>>> actually implemented in, the current LoadPushdown interface (you can only
>>> push projections). If I am wrong, that's great, but if not, that would be
>>> an important feature to add, as people are increasingly trying to connect
>>> Pig to "smart" storage systems like RDBMSes, HBase, and Cassandra. I think
>>> we only kind of simulate this with partition key info, which is not always
>>> sufficient.
>>> D
>>> On Wed, Jan 26, 2011 at 2:41 PM, Julien Le Dem <[EMAIL PROTECTED]> wrote:
>>>> If making Pig thread-safe (i.e., two threads each running a different Pig
>>>> script) is important, then we need to change some of the APIs from static
>>>> singleton access to a dependency injection pattern.
>>>> In that case, this should probably be done before 1.0.
>>>> For example, UDFContext should be passed to the UDF after construction
>>>> (similar to the ServletContext in Servlet, or the way Hadoop passes the
>>>> context to tasks).
>>>> Also, a clearly separated API that does not depend on the Pig
>>>> implementation would help.
>>>> For example, UDFContext is in org.apache.pig.impl.util when it would be
>>>> better in org.apache.pig.api (or at least an interface defining it).
>>>> Julien
>>>> On 1/24/11 10:14 AM, "Olga Natkovich" <[EMAIL PROTECTED]> wrote:
>>>> Hi Guys,
>>>> I think it is time for us to have another meeting. Yahoo would be happy to
>>>> host if this works for everybody. How about Wednesday, 2/9, 4-6 pm? Please
>>>> let us know if you are planning to attend and if the date/time works for
>>>> you.
>>>> Things that come to mind to discuss (as always, feel free to suggest
>>>> others):
>>>> - Error handling proposal - this might be easier to finalize face-to-face
>>>> - Pig 0.9 plan
>>>> - Pig roadmap beyond 0.9
>>>>   o What do we want to do in Pig.next?
>>>>   o Are we ready for Pig 1.0?
>>>> Olga
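
To make the dependency injection suggestion from earlier in the thread concrete, here is a minimal sketch of handing a per-script context to a UDF after construction instead of reading it from a static singleton. The names (InjectedContextSketch, ScriptContext, ContextAwareUdf, PrefixUdf, runScript) are hypothetical stand-ins, not Pig's actual UDFContext or EvalFunc APIs.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of context injection; these are not Pig classes.
final class InjectedContextSketch {

    /** Per-script context; stands in for something like UDFContext. */
    static final class ScriptContext {
        private final Map<String, String> props = new HashMap<>();

        ScriptContext set(String key, String value) {
            props.put(key, value);
            return this;
        }

        String get(String key) {
            return props.get(key);
        }
    }

    /** Hypothetical UDF contract: the runtime injects the context after construction. */
    interface ContextAwareUdf {
        void setContext(ScriptContext context);

        String exec(String input);
    }

    /** Example UDF that reads its configuration only from the injected context. */
    static final class PrefixUdf implements ContextAwareUdf {
        private ScriptContext context;

        @Override
        public void setContext(ScriptContext context) {
            this.context = context;
        }

        @Override
        public String exec(String input) {
            return context.get("prefix") + input;
        }
    }

    /** Runs one "script" with its own context; no static state is shared. */
    static String runScript(ScriptContext context, String input) {
        ContextAwareUdf udf = new PrefixUdf();
        udf.setContext(context);
        return udf.exec(input);
    }

    public static void main(String[] args) throws InterruptedException {
        String[] results = new String[2];
        // Two threads, each running a different "script" with a separate context.
        Thread a = new Thread(() ->
                results[0] = runScript(new ScriptContext().set("prefix", "A:"), "x"));
        Thread b = new Thread(() ->
                results[1] = runScript(new ScriptContext().set("prefix", "B:"), "x"));
        a.start();
        b.start();
        a.join();
        b.join();
        System.out.println(results[0] + " " + results[1]);
    }
}
```

Because each thread builds and passes its own context, two scripts cannot overwrite each other's settings the way they could through a shared static accessor.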