-
Using external jar in UDF (Tamir Kamara, 2010-03-10, 08:52)
Hi,
I have an eval function that needs to use an external jar. In the M/R world this can be accomplished by uploading the jar to the DFS and using DistributedCache.addFileToClassPath. How do I do the same (make the jar available to the UDF) in Pig?
Thanks,
Tamir
-
Re: Using external jar in UDF (Jeff Zhang, 2010-03-10, 09:21)
Use *REGISTER myfunc.jar;*
See: http://hadoop.apache.org/pig/docs/r0.5.0/piglatin_reference.html#REGISTER
--
Best Regards
Jeff Zhang
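For illustration, a minimal sketch of how REGISTER is typically used in a script. The jar name myfunc.jar comes from the reply above; the class name com.example.MyEval, the alias MyEval, and the input file are hypothetical and are not taken from the thread.

```pig
-- Hypothetical names: com.example.MyEval, input.txt
REGISTER myfunc.jar;                      -- ship the UDF jar with the job
DEFINE MyEval com.example.MyEval();       -- short alias for the UDF class
lines = LOAD 'input.txt' AS (line:chararray);
result = FOREACH lines GENERATE MyEval(line);
DUMP result;
```

REGISTER only covers the jar that is named, so any library the UDF imports still has to reach the cluster some other way, which is the question the rest of the thread addresses.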
-
Re: Using external jar in UDF (Jeff Zhang, 2010-03-10, 09:25)
Sorry, maybe I misunderstood you. It seems you'd like to use a third-party library in your UDF; in that case you need to package your UDF and the third-party library in one jar.
--
Best Regards
Jeff Zhang
-
Re: Using external jar in UDF (Tamir Kamara, 2010-03-10, 09:36)
Hi Jeff,
You are right: I want to use another jar in my own UDF. Packaging both into a single jar is certainly an option, but I was hoping Pig would be able to do something similar to regular map-reduce, where I push the jar to the DFS beforehand and then add it to the classpath via the distributed cache.
Thanks,
Tamir
-
Re: Using external jar in UDF (Tamir Kamara, 2010-03-10, 10:21)
Hi,
REGISTER works fine, but it means the user needs to know when the additional jar has to be registered. What about my question regarding the M/R way of doing this?
Thanks,
Tamir
-
Re: Using external jar in UDF (Jeff Zhang, 2010-03-10, 10:28)
Sorry, what do you mean by the M/R way? In Pig you actually have no way to touch the M/R code.
--
Best Regards
Jeff Zhang
-
Re: Using external jar in UDF (Tamir Kamara, 2010-03-10, 10:33)
Hi,
In M/R, when you need an extra jar, you add it to the classpath by calling:
DistributedCache.addFileToClassPath(dfs-path-to-jar);
I imagine that the register command does something similar under the covers, but I was looking for a way to have the UDF load its own dependency jar, so that it is not left up to the user to remember to issue the second register command (for the dependency jar) themselves.
Thanks,
Tamir
-
Re: Using external jar in UDF (Alan Gates, 2010-03-15, 18:49)
The UDF interface does not currently include the ability for a UDF to indicate additional jars it would like to have packaged and sent along.
Alan.
-
Re: Using external jar in UDF (Corbin Hoenes, 2010-03-15, 20:52)
Okay, what do you mean by "package and send along"? What is the Pig way to include additional jars? For example, we want to use a third-party library to encode JSON; how can our UDF reference that jar?
-
Re: Using external jar in UDF (Dmitriy Ryaboy, 2010-03-15, 21:08)
Your UDF references the classes the regular way: just use imports. The trick is to make sure the jars are on the machine and on the classpath. There are two ways to do this: pre-load them on the cluster and configure them to be on the default classpath, or use Pig's "REGISTER" keyword to register both your UDF jar and its dependencies (once per jar). What Alan is saying is that there is no way to create a UDF that would somehow tell Pig that it needs to package up and send a jar file located somewhere on the client machine; you have to do that in the Pig script yourself.
Additionally, thanks to Thejas, you can register jars on the command line if you are on Pig 0.7 (trunk): https://issues.apache.org/jira/browse/PIG-1226
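As a rough sketch of the REGISTER approach described above, with one statement per jar: the jar names json-udfs.jar and json-lib.jar, the class com.example.ToJson, and the file paths are all hypothetical and serve only to illustrate the pattern.

```pig
-- Hypothetical file and class names throughout.
REGISTER json-udfs.jar;   -- jar containing the UDF itself
REGISTER json-lib.jar;    -- third-party library the UDF imports
records = LOAD 'records.txt' AS (name:chararray, age:int);
encoded = FOREACH records GENERATE com.example.ToJson(name, age);
STORE encoded INTO 'records_json';
```

The second REGISTER is the step the user has to remember; as noted earlier in the thread, the UDF cannot declare that dependency on its own. For the command-line alternative, PIG-1226 is the place to check the exact syntax.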
-
Re: Using external jar in UDF (zaki rahaman, 2010-03-15, 21:31)
Hey Corbin,
Alternatively, you could include the JSON library as a dependency in whatever build tool you're using (Maven, Ant) and configure it to build a jar with dependencies.
--
Zaki Rahaman
-
Re: Using external jar in UDF (zaki rahaman, 2010-03-15, 21:31)
Hey,
How's the progress on the JSON UDF? If you post it on the Pig JIRA, I could take a look and help out. It would also get the ball rolling on getting the UDF added to piggybank.
--
Zaki Rahaman