Pig >> mail # dev >> LoadFunc and LoadMetadata

Jeff Yuan 2013-03-15, 03:37
Re: LoadFunc and LoadMetadata
getPartitionKeys should be called by default. Did you use an "AS" clause
in the load statement? That can add a foreach between the Load and the
Filter, and getPartitionKeys is only invoked if the filter comes right
after the load. Run an explain on the script to check for it.
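
For illustration, a rough sketch of the two cases in Pig Latin. The input and output paths and the field names (year, value) are made up for the example, and it assumes CustomLoader returns its schema through LoadMetadata.getSchema so that the filter can refer to fields by name without "AS":

```pig
-- Case 1: load with an AS clause. This can put a foreach between the
-- load and the filter, so getPartitionKeys may never be called.
A1 = LOAD 'input' USING CustomLoader() AS (year:int, value:chararray);
B1 = FILTER A1 BY year == 2013;
EXPLAIN B1;

-- Case 2: load without AS. The filter directly follows the load,
-- which is the shape the partition filter optimization looks for.
A2 = LOAD 'input' USING CustomLoader();
B2 = FILTER A2 BY year == 2013;
EXPLAIN B2;

STORE B2 INTO 'output';
```

In the explain output, look at the logical plan and check whether any operator sits between the load and the filter.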


On Thu, Mar 14, 2013 at 8:37 PM, Jeff Yuan <[EMAIL PROTECTED]> wrote:
> Hi all,
> For CustomLoader (a class I'm implementing), which extends LoadFunc and
> implements LoadMetadata, the "getPartitionKeys" function is supposed
> to be called by "PartitionFilterOptimizer", right? I put some debug
> statements in "getPartitionKeys", but this function doesn't seem to
> ever be called.
> I've read through some of the Pig source: optimization rules can be
> disabled by properties, but by default the "PartitionFilterOptimizer"
> should be enabled. Also, in "PartitionFilterOptimizer", I saw some
> other checks, for example that the Filter operator cannot have any
> dependency other than the load, which is true in my case. Anyway, can
> someone shed some light on this? Am I misunderstanding this interface?
> My script is very simple (line 1 is load, line 2 is filter, and line 3
> is store), so the Logical Plan should be very simple. Also, I'm
> testing this in Pig local mode; I'm not sure whether that matters.
> Greatly appreciate any hints!
Jeff Yuan 2013-03-15, 18:32
Daniel Dai 2013-03-15, 22:37