Pig >> mail # user >> Better formated.. Pig udf help


jamal sasha 2012-10-25, 21:46
pablomar 2012-10-25, 22:45
jamal sasha 2012-10-26, 17:24
Bill Graham 2012-10-25, 23:57
jamal sasha 2012-10-26, 17:35
Prashant Kommireddi 2012-10-26, 17:43
Re: Better formated.. Pig udf help
Hi,
So I followed the instructions.
echo $PIG_CLASSPATH
now points to the hadoop conf directory.
But on running pig -f time.pig
I still get the same error:
Problem resolving class version number
So this is what I did:
1. Wrote a program in Java as my UDF and exported it as udf.jar
2. Wrote a simple Pig script and registered that UDF
3. Now trying to run it through pig -f time.pig

On Friday, October 26, 2012, Prashant Kommireddi <[EMAIL PROTECTED]>
wrote:
> Do you see a "conf" dir within /path/hadoop? If yes, point your
> PIG_CLASSPATH to it:
>
> export PIG_CLASSPATH=/path/hadoop/conf
>
> Sent from my iPhone
>
> On Oct 26, 2012, at 10:35 AM, jamal sasha <[EMAIL PROTECTED]> wrote:
>
>> Hi,
>> Great catch.
>> Now I get an error:
>> Cannot find hadoop configuration in class path (neither hadoop site XML
>> etc)
>>
>> So I am running the script on a cluster which has hadoop set up as
>>
>> /path/hadoop
>> /path/pig
>>
>> And I have an account on it.
>> So I cannot change the hadoop conf files, as other users are also using
>> them.
>> How do I run this just for me?
>> On Thursday, October 25, 2012, Bill Graham <[EMAIL PROTECTED]> wrote:
>>> Somewhere you have a typo, probably in the execution of your program:
>>>
>>> Exception in thread "main" java.lang.NoClassDefFoundError:
>>> org/pache/pig/Main
>>>
>>> Caused by: java.lang.ClassNotFoundException: org.pache.pig.Main
>>>
>>>        at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
>>>
>>> Note that the 'a' in apache is missing.
>>>
>>>
>>> On Thu, Oct 25, 2012 at 2:46 PM, jamal sasha <[EMAIL PROTECTED]> wrote:
>>>
>>>> Hi,
>>>>
>>>> I am trying to write a Pig UDF. Basically the data is of the format
>>>>
>>>> Id,time
>>>>
>>>> What I am trying to do is parse the time and then see whether it's
>>>> breakfast, lunch or dinner, based on the timestamp. Some entries will be
>>>> null as well.
>>>>
>>>> So here is the UDF code for this.
>>>>
>>>> import java.io.IOException;
>>>> import java.text.DateFormat;
>>>> import java.text.ParseException;
>>>> import java.text.SimpleDateFormat;
>>>> import java.util.Date;
>>>>
>>>> import org.apache.pig.EvalFunc;
>>>> import org.apache.pig.data.Tuple;
>>>>
>>>> public class time extends EvalFunc<String> {
>>>>
>>>>     public String exec(Tuple input) throws IOException {
>>>>         // Null or empty tuples yield a null result.
>>>>         if ((input == null) || (input.size() == 0))
>>>>             return null;
>>>>         try {
>>>>             String time = (String) input.get(0);
>>>>             DateFormat df = new SimpleDateFormat("hh:mm:ss.000");
>>>>             Date date = df.parse(time);
>>>>             String timeOfDay = getTimeOfDay(date);
>>>>             return timeOfDay;
>>>>         } catch (ParseException e) {
>>>>             // how will I handle when df.parse(time) fails and throws ParseException?
>>>>             // maybe:
>>>>             return null;
>>>>         }
>>>>     }
>>>>
>>>>     // getTimeOfDay(Date) is not shown here.
>>>> }
>>>>
>>>> After this, in Eclipse, I exported this as a jar called myudfs,
>>>> so I have a jar file called myudfs.jar
>>>>
>>>> Then I wrote the Pig script as
>>>>
>>>> Time.pig
>>>>
>>>> REGISTER path/to/udf/myudfs.jar
>>>>
>>>> in = LOAD 'path/to/input' USING PigStorage(',') AS (id:long,
>>>> time:chararray);
>>>>
>>>> result = f
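
The UDF quoted above calls getTimeOfDay(date), which the message does not show, and asks what to do when parsing fails. What follows is only a rough sketch of one way the classification could look, not the poster's actual code. The class name TimeOfDay, the method classify, and the hour cut-offs (before 11 is breakfast, before 16 is lunch, otherwise dinner) are all assumptions made for illustration. It also assumes 24-hour timestamps and so uses the pattern "HH:mm:ss.SSS" instead of the "hh:mm:ss.000" in the original; in SimpleDateFormat, "hh" is the 12-hour clock field and "HH" is the 24-hour one. Null and unparseable values both map to null, which a Pig UDF can return as-is.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Date;
import java.util.TimeZone;

// Hypothetical helper for illustration; not part of Pig and not from the thread.
final class TimeOfDay {

    private TimeOfDay() {}

    // Returns "breakfast", "lunch" or "dinner", or null when the input is
    // null or does not match HH:mm:ss.SSS.
    static String classify(String time) {
        if (time == null) {
            return null; // null entries stay null
        }
        TimeZone utc = TimeZone.getTimeZone("UTC");
        // SimpleDateFormat is not thread-safe, so create one per call.
        SimpleDateFormat df = new SimpleDateFormat("HH:mm:ss.SSS");
        df.setTimeZone(utc);
        df.setLenient(false); // reject values such as 25:00:00.000
        Date date;
        try {
            date = df.parse(time);
        } catch (ParseException e) {
            return null; // unparseable entries are treated like nulls
        }
        // Read the hour back in the same time zone used for parsing.
        Calendar cal = Calendar.getInstance(utc);
        cal.setTime(date);
        int hour = cal.get(Calendar.HOUR_OF_DAY);
        // Assumed cut-offs; adjust to whatever the data actually needs.
        if (hour < 11) {
            return "breakfast";
        }
        if (hour < 16) {
            return "lunch";
        }
        return "dinner";
    }
}
```

With a helper like this, the body of exec could shrink to the empty-tuple check followed by return TimeOfDay.classify((String) input.get(0)); keeping the parsing outside the EvalFunc also makes it testable without a Pig installation.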