Pig >> mail # dev >> LoadFunc and LoadMetadata


Jeff Yuan 2013-03-15, 03:37
Re: LoadFunc and LoadMetadata
getPartitionKeys should be called by default. Did you use an "AS" clause
in the load statement? That can add a foreach between the load and the
filter, and getPartitionKeys is only invoked when the filter comes right
after the load. Run an explain on the script to check for it.
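
A rough sketch of the two cases, in case it helps. The path 'input_path',
the column names, and the partition key 'dt' are made up for illustration;
CustomLoader is the class from the question. Whether a foreach actually
shows up depends on the script and the Pig version, so treat the comments
as what to look for in the explain output, not as guaranteed results.

```
-- Case 1: no AS clause. The schema is expected to come from the
-- loader's getSchema(), so the filter can sit directly after the load.
A = LOAD 'input_path' USING CustomLoader();
B = FILTER A BY dt == '2013-03-14';
EXPLAIN B;

-- Case 2: AS clause with declared types. Check the logical plan for a
-- foreach inserted between the load and the filter.
A2 = LOAD 'input_path' USING CustomLoader() AS (dt:chararray, value:int);
B2 = FILTER A2 BY dt == '2013-03-14';
EXPLAIN B2;
```

If the plan for the second case shows a foreach between the load and the
filter, that would explain why getPartitionKeys is never reached.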

Thanks,
Daniel

On Thu, Mar 14, 2013 at 8:37 PM, Jeff Yuan <[EMAIL PROTECTED]> wrote:
> Hi all,
>
> For CustomLoader (a class I'm implementing), which extends LoadFunc and
> implements LoadMetadata, the "getPartitionKeys" function is supposed
> to be called by "PartitionFilterOptimizer", right? I put some debug
> statements in "getPartitionKeys", but this function doesn't seem like
> it's ever called.
>
> I've read through some of the Pig source. Optimization rules can be
> disabled by properties, but by default the "PartitionFilterOptimizer"
> should be enabled. Also, in "PartitionFilterOptimizer", I saw some
> other checks, such as that the Filter operator cannot have a dependency
> other than load, which is true in my case. Anyway, can someone shed
> some light on this? Am I understanding this interface incorrectly?
>
> My script is very simple (line 1 is load, line 2 is filter, and line 3
> is store), so the Logical Plan should be very simple. Also, I'm
> testing this in Pig local mode, not sure if that matters.
>
> Greatly appreciate any hints!
Jeff Yuan 2013-03-15, 18:32
Daniel Dai 2013-03-15, 22:37