Re: Hadoop Job History Loader with PIG
Zebeljan, Nebojsa 2012-10-11, 07:46
Hi Cheolsoo,
I've found the reason why the "HadoopJobHistoryLoader" is not available.
In Cloudera's distro the class is excluded when building the piggybank
-> ./contrib/piggybank/java/build.xml
-> ./cloudera/patches/0001-CLOUDERA-BUILD.-CDHifying-Pig-0.9.1-build.patch

---
<!-- JobHistoryLoader currently does not support 0.23 -->
<condition property="build.classes.excludes" value="**/HadoopJobHistoryLoader.java" else="">
  <equals arg1="${hadoopversion}" arg2="23"/>
</condition>
<condition property="test.classes.excludes" value="**/TestHadoopJobHistoryLoader.java" else="">
  <equals arg1="${hadoopversion}" arg2="23"/>
</condition>
---

Do you know if this "exclude" is still needed for hadoop-2.x?

Thanks in advance!
Nebo

On 11.10.12 09:29, "Zebeljan, Nebojsa" <[EMAIL PROTECTED]> wrote:

>Hi Cheolsoo,
>Yes, I've registered the piggybank jar in the pig script - see script
>below.
>
>---
>REGISTER /usr/lib/pig/contrib/piggybank/java/piggybank.jar;
>
>a = load '/some_dir/some_aggregation/_logs/history' using
>    HadoopJobHistoryLoader() as (j:map[], m:map[], r:map[]);
>b = foreach a generate j#'PIG_SCRIPT_ID' as id, j#'USER' as user,
>    j#'JOBNAME' as script_name,
>    (Long) j#'SUBMIT_TIME' as start, (Long) j#'FINISH_TIME' as end;
>c = group b by (id, user, script_name);
>d = foreach c generate group.user, group.script_name,
>    (MAX(b.end) - MIN(b.start))/1000;
>dump d;
>---
>
>I've also downloaded Pig from Cloudera, version 4.0.1, again and grepped
>the piggybank.jar for the "HadoopJobHistoryLoader" class - but I'm still
>not finding the class?!
>
>I also grepped /usr/lib/pig/contrib/piggybank/java/piggybank.jar - same
>result ...
>
>What am I doing wrong here?
>
>Thanks for any help!
>Nebo
>
>On 11.10.12 06:30, "Cheolsoo Park" <[EMAIL PROTECTED]> wrote:
>
>>Hi Nebojsa,
>>
>>Did you register piggybank.jar in your Pig script?
>>
>>REGISTER <path_to_piggybank.jar>;
>>
>>In CDH4.0.1, piggybank.jar can be found at
>>/usr/lib/pig/contrib/piggybank/java/piggybank.jar.
>>
>>Thanks,
>>Cheolsoo
>>
>>On Wed, Oct 10, 2012 at 5:23 AM, Zebeljan, Nebojsa
>><[EMAIL PROTECTED]> wrote:
>>
>>> Hi,
>>> I'm using cdh 4.0.1 with pig-0.9.2+26.
>>>
>>> I've tried to gather some information about my result files
>>> aggregated by pig with the HadoopJobHistoryLoader() as described here:
>>>
>>> http://archive.cloudera.com/cdh/3/pig/piglatin_ref1.html#Hadoop+Job+History+Loader
>>>
>>> Running a simple pig script returns "ERROR 1070: Could not resolve
>>> org.apache.pig.piggybank.storage.HadoopJobHistoryLoader using imports:
>>> [, org.apache.pig.builtin., org.apache.pig.impl.builtin.]"
>>>
>>> With this information, I've found that the HadoopJobHistoryLoader
>>> class does not exist in the piggybank!
>>>
>>> According to the API, this class should exist:
>>>
>>> http://pig.apache.org/docs/r0.9.2/api/org/apache/pig/piggybank/storage/HadoopJobHistoryLoader.html
>>>
>>> Can someone please enlighten me ...
>>>
>>> Thanks!
>>>
>>> Regards,
>>> Nebo
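
The jar check described in the thread (grepping piggybank.jar for the class) can also be done by listing the archive entries, since a jar is a zip file. The following is only a minimal sketch, not something from the thread: the helper name find_class_in_jar is hypothetical, and it assumes Python 3 is available on the machine that holds the jar.

```python
import sys
import zipfile


def find_class_in_jar(jar_path, class_name):
    """Return jar entries whose file name is exactly <class_name>.class.

    find_class_in_jar is a hypothetical helper name, not part of Pig or CDH.
    """
    suffix = "/" + class_name + ".class"
    with zipfile.ZipFile(jar_path) as jar:
        # Prefix each entry with "/" so a class in the default package
        # (no directory part) is matched as well.
        return [name for name in jar.namelist() if ("/" + name).endswith(suffix)]


if __name__ == "__main__":
    # Arguments: path to the jar, then the simple class name.
    jar_path, class_name = sys.argv[1], sys.argv[2]
    hits = find_class_in_jar(jar_path, class_name)
    if hits:
        print("\n".join(hits))
    else:
        print(class_name + " not found in " + jar_path)
```

Saved under any file name and run with the jar path from the thread (/usr/lib/pig/contrib/piggybank/java/piggybank.jar) and HadoopJobHistoryLoader as arguments, an empty result would be consistent with the class having been excluded at build time.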