Pig, mail # user - Hadoop Job History Loader with PIG


Re: Hadoop Job History Loader with PIG
Zebeljan, Nebojsa 2012-10-11, 07:29
Hi Cheolsoo,
Yes, I've registered the piggybank jar in the Pig script; see the script
below.

---
REGISTER /usr/lib/pig/contrib/piggybank/java/piggybank.jar;

a = load '/some_dir/some_aggregation/_logs/history' using
    HadoopJobHistoryLoader() as (j:map[], m:map[], r:map[]);
b = foreach a generate j#'PIG_SCRIPT_ID' as id, j#'USER' as user,
    j#'JOBNAME' as script_name,
    (Long) j#'SUBMIT_TIME' as start, (Long) j#'FINISH_TIME' as end;
c = group b by (id, user, script_name);
d = foreach c generate group.user, group.script_name,
    (MAX(b.end) - MIN(b.start))/1000;
dump d;
---

I've also downloaded Pig from Cloudera (CDH 4.0.1) again and grepped
the piggybank.jar for the "HadoopJobHistoryLoader" class, but I still
can't find the class.

I also grepped /usr/lib/pig/contrib/piggybank/java/piggybank.jar, with the
same result ...
What am I doing wrong here?
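
The jar check described above (grepping piggybank.jar for the class) can also be done by listing the archive's entries. This is a minimal Python sketch, assuming only that a jar is a standard zip archive. The helper name `find_class_entries` is hypothetical and is not part of Pig or piggybank.

```python
import zipfile


def find_class_entries(jar_path, class_name):
    """Return jar entries for class_name, including its inner classes.

    Hypothetical helper for illustration; not part of Pig or piggybank.
    """
    suffix = ".class"
    matches = []
    with zipfile.ZipFile(jar_path) as jar:  # a jar is an ordinary zip archive
        for entry in jar.namelist():
            if not entry.endswith(suffix):
                continue
            # "a/b/Foo$Bar.class" -> "Foo$Bar" -> outer class "Foo"
            simple_name = entry.rsplit("/", 1)[-1][: -len(suffix)]
            if simple_name.split("$", 1)[0] == class_name:
                matches.append(entry)
    return matches
```

Calling `find_class_entries('/usr/lib/pig/contrib/piggybank/java/piggybank.jar', 'HadoopJobHistoryLoader')` and getting an empty list would suggest that this particular jar was built without the class, which would be consistent with ERROR 1070.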

Thanks for any help!
Nebo

On 11.10.12 06:30, "Cheolsoo Park" <[EMAIL PROTECTED]> wrote:

>Hi Nebojsa,
>
>Did you register piggybank.jar in your Pig script?
>
>REGISTER <path_to_piggybank.jar>;
>
>In CDH4.0.1, piggybank.jar can be found at
>/usr/lib/pig/contrib/piggybank/java/piggybank.jar.
>
>Thanks,
>Cheolsoo
>
>On Wed, Oct 10, 2012 at 5:23 AM, Zebeljan, Nebojsa <
>[EMAIL PROTECTED]> wrote:
>
>> Hi,
>> I'm using CDH 4.0.1 with pig-0.9.2+26.
>>
>> I've tried to gather some information about my result files aggregated by
>> Pig with the HadoopJobHistoryLoader() as described here:
>>
>>http://archive.cloudera.com/cdh/3/pig/piglatin_ref1.html#Hadoop+Job+History+Loader
>>
>> Running a simple Pig script returns "ERROR 1070: Could not resolve
>> org.apache.pig.piggybank.storage.HadoopJobHistoryLoader using imports:
>> [, org.apache.pig.builtin., org.apache.pig.impl.builtin.]"
>>
>> Given this error, I checked and found that the HadoopJobHistoryLoader
>> class does not exist in the piggybank jar!
>>
>> According to the API docs, this class should exist:
>>
>>http://pig.apache.org/docs/r0.9.2/api/org/apache/pig/piggybank/storage/HadoopJobHistoryLoader.html
>>
>> Can someone please enlighten me ...
>>
>> Thanks!
>>
>> Regards,
>> Nebo
>>
>>