Pig >> mail # dev >> Pig developer meeting in February


Re: Pig developer meeting in February
> Are you saying that as long as one claims every column as a partition, all filters will be pushed
> down?

Exactly. The javadoc is worded heavily toward partition pruning,
since that was the primary use case for predicate pushdown at the
time. But you will get all the filter expressions if you claim that
all the columns are partition columns. Partition columns have no
special semantics in Pig apart from this.

> Will the filters also be applied to the data the loader returns, even if the loader accepts the
> expression?

I think the filter will be deleted from the logical plan if it is
pushed up to the loader. So it won't be applied in the pipeline later
on. Daniel, can you confirm whether that's the case with the new
logical plan?

Ashutosh
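
To make the exchange above concrete, here is a minimal, self-contained sketch of the idea. It does not use Pig's classes: SketchLoader, Filter, row and tryPush are hypothetical stand-ins, and only the method names getPartitionKeys and setPartitionFilter are borrowed from LoadMetadata (with simplified signatures). It assumes, as suggested above but not confirmed in this thread, that a filter is handed to the loader only when every column it reads is a claimed partition key, and that the caller then drops the filter from its own plan.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Predicate;

public class PushdownSketch {

    /** Hypothetical filter: the columns it reads plus the test itself. */
    public static class Filter {
        final Set<String> columns;
        final Predicate<Map<String, Object>> test;

        public Filter(Set<String> columns, Predicate<Map<String, Object>> test) {
            this.columns = columns;
            this.test = test;
        }
    }

    /** Hypothetical loader; method names loosely mirror LoadMetadata. */
    public static class SketchLoader {
        private final List<Map<String, Object>> rows;
        private Filter pushed; // stays null until a filter is handed over

        public SketchLoader(List<Map<String, Object>> rows) {
            this.rows = rows;
        }

        /** Claim every column as a partition key so any filter on them is eligible. */
        public String[] getPartitionKeys() {
            return new String[] {"user", "age"};
        }

        public void setPartitionFilter(Filter filter) {
            this.pushed = filter;
        }

        /** Applies the pushed filter (if any) while producing rows. */
        public List<Map<String, Object>> load() {
            List<Map<String, Object>> out = new ArrayList<>();
            for (Map<String, Object> r : rows) {
                if (pushed == null || pushed.test.test(r)) {
                    out.add(r);
                }
            }
            return out;
        }
    }

    /** Returns true if the filter was pushed; the caller would then drop it from its plan. */
    public static boolean tryPush(SketchLoader loader, Filter filter) {
        Set<String> keys = new HashSet<>(Arrays.asList(loader.getPartitionKeys()));
        if (!keys.containsAll(filter.columns)) {
            return false; // reads an unclaimed column, so it stays in the pipeline
        }
        loader.setPartitionFilter(filter);
        return true;
    }

    /** Small helper to build a two-column row. */
    public static Map<String, Object> row(String user, int age) {
        Map<String, Object> r = new HashMap<>();
        r.put("user", user);
        r.put("age", age);
        return r;
    }

    public static void main(String[] args) {
        SketchLoader loader = new SketchLoader(Arrays.asList(row("ann", 31), row("bob", 17)));
        Filter adults = new Filter(Collections.singleton("age"),
                r -> (Integer) r.get("age") >= 18);
        boolean pushed = tryPush(loader, adults);
        System.out.println(pushed + " " + loader.load().size());
    }
}
```

In this sketch the loader is solely responsible for applying a pushed filter, which is why a loader that accepts an expression it cannot honor would return unfiltered rows.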

On Thu, Jan 27, 2011 at 17:21, Julien Le Dem <[EMAIL PROTECTED]> wrote:
> Me too.
> Julien
>
>
> On 1/27/11 4:09 PM, "Dmitriy Ryaboy" <[EMAIL PROTECTED]> wrote:
>
> Ok yeah I'll come :).
>
>
>
> On Thu, Jan 27, 2011 at 3:17 PM, Olga Natkovich <[EMAIL PROTECTED]> wrote:
>
>> While there is a lively discussion on this thread, I have not actually
>> gotten any responses about having the meeting, with the exception of one person :).
>>
>> Please, let me know by the end of the week if you are planning to attend.
>> If we don't get at least a few more responses I suggest we postpone the
>> meeting.
>>
>> Thanks,
>>
>> Olga
>>
>> -----Original Message-----
>> From: Dmitriy Ryaboy [mailto:[EMAIL PROTECTED]]
>> Sent: Wednesday, January 26, 2011 6:04 PM
>> To: [EMAIL PROTECTED]
>> Subject: Re: Pig developer meeting in February
>>
>> Right, we do partition filtering, but not true predicate pushdown.
>>
>> On Wed, Jan 26, 2011 at 5:59 PM, Daniel Dai <[EMAIL PROTECTED]>
>> wrote:
>>
>> > Are you talking about LoadMetadata.setPartitionFilter?
>> > PartitionFilterOptimizer will do that.
>> >
>> > Daniel
>> >
>> >
>> > Dmitriy Ryaboy wrote:
>> >
>> >> I may be wrong, but I think predicate pushdown is designed for but not
>> >> actually implemented in the current LoadPushDown interface (you can only
>> >> push projections). If I am wrong, that's great, but if not, that would be
>> >> an important feature to add, as people are increasingly trying to connect
>> >> Pig to "smart" storage systems like RDBMSes, HBase, and Cassandra. I think
>> >> we only kind of simulate this with partition key info, which is not always
>> >> sufficient.
>> >>
>> >> D
>> >>
>> >> On Wed, Jan 26, 2011 at 2:41 PM, Julien Le Dem <[EMAIL PROTECTED]>
>> >> wrote:
>> >>
>> >>
>> >>
>> >>> If making Pig thread-safe (i.e., two threads each running a different
>> >>> Pig script) is important, then we need to change some of the APIs from
>> >>> static singleton access to a dependency injection pattern.
>> >>> In that case, this should probably be done before 1.0.
>> >>> For example: UDFContext should be passed to the UDF after construction
>> >>> (similar to the ServletContext in Servlet or the way Hadoop passes the
>> >>> context to tasks).
>> >>> Also a clearly separated API that does not depend on the Pig
>> >>> implementation
>> >>> would help.
>> >>> For example UDFContext is in org.apache.pig.impl.util when it would be
>> >>> better in org.apache.pig.api (Or at least an interface defining it)
>> >>>
>> >>> Julien
>> >>>
>> >>> On 1/24/11 10:14 AM, "Olga Natkovich" <[EMAIL PROTECTED]> wrote:
>> >>>
>> >>> Hi Guys,
>> >>>
>> >>> I think it is time for us to have another meeting. Yahoo would be happy to
>> >>> host if this works for everybody. How about Wednesday, 2/9, 4-6 pm?
>> >>> Please let us know if you are planning to attend and if the date/time
>> >>> works for you.
>> >>>
>> >>> Things that come to mind to discuss and as always feel free to suggest
>> >>> others:
>> >>>
>> >>> - Error handling proposal - this might be easier to finalize face-to-face
>> >>> - Pig 0.9 plan
>> >>> - Pig Roadmap beyond 0.9
>> >>>   - What do we want to do in Pig.next?
>> >>>   - Are we ready for Pig 1.0?
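
Julien's suggestion quoted above, passing the context to the UDF after construction instead of reaching for a static singleton, might look roughly like the sketch below. None of these types are Pig's: ScriptContext, ContextAwareUdf, TagWithScript and runScript are hypothetical names chosen for illustration, and the real UDFContext carries more state than this.

```java
import java.util.Properties;

public class ContextInjectionSketch {

    /** Hypothetical per-script context; a stand-in for what UDFContext holds. */
    public static class ScriptContext {
        private final Properties props = new Properties();

        public ScriptContext(String scriptName) {
            props.setProperty("script.name", scriptName);
        }

        public String get(String key) {
            return props.getProperty(key);
        }
    }

    /** Hypothetical UDF contract: the runtime injects the context after construction. */
    public interface ContextAwareUdf {
        void setContext(ScriptContext context);

        String exec(String input);
    }

    /** Example UDF that reads only the context it was given, never a global. */
    public static class TagWithScript implements ContextAwareUdf {
        private ScriptContext context;

        @Override
        public void setContext(ScriptContext context) {
            this.context = context;
        }

        @Override
        public String exec(String input) {
            return context.get("script.name") + ":" + input;
        }
    }

    /** Each script run builds its own context and injects it into its own UDF instance. */
    public static String runScript(String scriptName, String input) {
        ScriptContext context = new ScriptContext(scriptName);
        ContextAwareUdf udf = new TagWithScript();
        udf.setContext(context);
        return udf.exec(input);
    }

    public static void main(String[] args) throws InterruptedException {
        String[] results = new String[2];
        // Two threads, each running a different "script" with its own context.
        Thread a = new Thread(() -> results[0] = runScript("scriptA", "x"));
        Thread b = new Thread(() -> results[1] = runScript("scriptB", "y"));
        a.start();
        b.start();
        a.join();
        b.join();
        System.out.println(results[0] + " " + results[1]); // prints "scriptA:x scriptB:y"
    }
}
```

Because no state is shared through a static field, the two runs cannot see each other's context, which is the property the thread-safety point above is after.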