Pig >> mail # dev >> LoadFunc and LoadMetadata


Re: LoadFunc and LoadMetadata
getPartitionKeys should be called by default. Did you use an "AS" clause
in the load statement? That can add a foreach between the Load and the
Filter, and getPartitionKeys is only invoked if the filter comes right
after the load. Run EXPLAIN on the script to check for it.
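
A rough sketch of the check, not a tested script. The input and output
paths, the field names, and the partition column "dt" are made up for
illustration, and CustomLoader is assumed to be registered already and
to report "dt" through getSchema and getPartitionKeys. Whether the
foreach appears may depend on the Pig version and on the types declared
in AS, so compare the EXPLAIN output for both forms:

```pig
-- Variant 1: no AS clause, so the schema comes from the loader itself.
-- In the logical plan the filter should sit directly on top of the load.
A = LOAD '/data/events' USING CustomLoader();
B = FILTER A BY dt == '2013-03-14';
STORE B INTO '/tmp/events_filtered';
EXPLAIN B;

-- Variant 2 (run as a separate script): an AS clause on the load.
-- If the plan shows a foreach between the load and the filter,
-- the partition filter rule will not fire.
-- A = LOAD '/data/events' USING CustomLoader()
--         AS (id:chararray, dt:chararray);
-- B = FILTER A BY dt == '2013-03-14';
-- EXPLAIN B;
```

If the plan for your script shows a foreach sitting between the load and
the filter, that would explain why your debug statements never print.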

Thanks,
Daniel

On Thu, Mar 14, 2013 at 8:37 PM, Jeff Yuan <[EMAIL PROTECTED]> wrote:
> Hi all,
>
> For CustomLoader (a class I'm implementing), which extends LoadFunc and
> implements LoadMetadata, the "getPartitionKeys" function is supposed
> to be called by "PartitionFilterOptimizer", right? I put some debug
> statements in "getPartitionKeys", but this function never seems to
> be called.
>
> I've read through some of the Pig source: optimization rules can be
> disabled by properties, but by default the "PartitionFilterOptimizer"
> should be enabled. Also, in "PartitionFilterOptimizer", I saw some
> other checks, such as that the Filter operator cannot have a dependency
> other than the load, which is true in my case. Anyway, can someone shed
> some light on this? Am I misunderstanding this interface?
>
> My script is very simple (line 1 is load, line 2 is filter, and line 3
> is store), so the Logical Plan should be very simple. Also, I'm
> testing this in Pig local mode, not sure if that matters.
>
> Greatly appreciate any hints!