

Re: Create rdbms like sequence in Pig on Pig Relation
It helps, but I am not able to invoke java.util.UUID.toString, maybe
because it doesn't take an argument.  This is from the docs:

DEFINE UrlDecode InvokeForString('java.net.URLDecoder.decode', 'String String');
encoded_strings = LOAD 'encoded_strings.txt' as (encoded:chararray);
decoded_strings = FOREACH encoded_strings GENERATE UrlDecode(encoded, 'UTF-8');

Maybe I forgot, but is this how I do it?

DEFINE UUID InvokeForString('java.util.UUID.toString');
with_uuid = FOREACH my_stuff generate UUID(), *;

Sorry, I only understand example code - not APIs. My Java is quite weak.

http://docs.oracle.com/javase/6/docs/api/java/util/UUID.html#toString()
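
A note on why the attempt above is unlikely to work, with a hedged sketch. UUID.toString() is an instance method: it needs an existing UUID object to be called on. UUID.randomUUID() is static, but it returns a UUID object rather than a String. Assuming the dynamic invokers only handle static methods that return a String or a primitive (my reading of the Pig docs, not checked against every version), a tiny static helper that returns the UUID as text sidesteps both problems. The class and method names below are hypothetical, not part of Pig or the JDK.

```java
import java.util.UUID;

// Hypothetical helper (not part of Pig): a static, String-returning wrapper
// around java.util.UUID.randomUUID().
final class UuidStrings {
    private UuidStrings() {}

    /** Returns a new random UUID in its canonical 36-character text form. */
    static String randomUuidString() {
        return UUID.randomUUID().toString();
    }

    public static void main(String[] args) {
        // Prints one freshly generated UUID string.
        System.out.println(randomUuidString());
    }
}
```

Whether InvokeForString accepts a method with no arguments depends on the Pig version. If it does, something along the lines of DEFINE UUID InvokeForString('UuidStrings.randomUuidString'); after REGISTERing the jar would be the thing to try (untested here). If it does not, the same one-line body can go inside the exec() method of a custom EvalFunc<String> UDF, as suggested further down the thread.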

On Sun, May 27, 2012 at 2:33 AM, Subir S <[EMAIL PROTECTED]> wrote:

> I hope this helps: the DynamicInvoker feature in Pig, added in 0.8.0.
>
>
> http://squarecog.wordpress.com/2010/08/20/upcoming-features-in-pig-0-8-dynamic-invokers/
>
> Thanks
>
> On 5/24/12, Russell Jurney <[EMAIL PROTECTED]> wrote:
> > Thanks, I mean how do you invoke it directly in grunt> from Pig?
> >
> > I keep messing it up for the last 30 minutes. Should I check the settings
> > on my pacemaker, I feel like Fabio on NyQuil messing with this.
> >
> > On Wed, May 23, 2012 at 10:19 PM, Subir S <[EMAIL PROTECTED]>
> > wrote:
> >
> >> Hope this helps ->
> >> http://www.javapractices.com/topic/TopicAction.do?Id=56
> >>
> >> and this ->
> >>
> >> http://docs.oracle.com/javase/1.5.0/docs/api/java/util/UUID.html#randomUUID%28%29
> >>
> >> Thanks
> >>
> >>
> >>
> >> On Thu, May 24, 2012 at 10:42 AM, Russell Jurney
> >> <[EMAIL PROTECTED]>wrote:
> >>
> >> > How do you invoke java.util.UUID.randomUUID?  There is no invoker that
> >> > doesn't take an arg?
> >> >
> >> > On Sun, May 20, 2012 at 6:26 PM, Rajesh Balamohan <
> >> > [EMAIL PROTECTED]> wrote:
> >> >
> >> > > I don't think so. However, it's a single-line Java command. You can
> >> > > create a custom UDF for this and use it in your code.
> >> > >
> >> > > java.util.UUID.randomUUID();
> >> > >
> >> > > ~Rajesh.B
> >> > >
> >> > > On Sun, May 20, 2012 at 8:15 AM, DIPESH KUMAR SINGH
> >> > > <[EMAIL PROTECTED]>wrote:
> >> > >
> >> > > > Thanks Rajesh.
> >> > > >
> >> > > > Is GUID a built in UDF?
> >> > > >
> >> > > >
> >> > > > --
> >> > > > Dipesh
> >> > > >
> >> > > > On Sun, May 20, 2012 at 8:06 AM, Rajesh Balamohan <
> >> > > > [EMAIL PROTECTED]> wrote:
> >> > > >
> >> > > > > If you do not care about the sequence number and the intention is
> >> > > > > just to create a unique key, you can use a GUID, which doesn't
> >> > > > > require any synchronization at all (all mappers can run in parallel).
> >> > > > >
> >> > > > > The approach I suggested in my earlier mail matters mainly when you
> >> > > > > need a sequence number.
> >> > > > > ~Rajesh.B
> >> > > > >
> >> > > > > On Sun, May 20, 2012 at 8:02 AM, Rajesh Balamohan <
> >> > > > > [EMAIL PROTECTED]> wrote:
> >> > > > >
> >> > > > > > Pig doesn't have that facility yet. Moreover, it's not very
> >> > > > > > efficient to do this in Pig/MR, as it requires synchronization.
> >> > > > > >
> >> > > > > > However, if this is unavoidable for you, the following options
> >> > > > > > can be considered:
> >> > > > > >
> >> > > > > > 1. Maintaining the sequence number details in ZooKeeper.
> >> > > > > > 2. Keeping a simple structure in an HBase table (seqNumber -->
> >> > > > > > Value). You can get a bucket of values (e.g. 1000-2000) from it
> >> > > > > > and use it in your UDF. When the range is depleted, you have to
> >> > > > > > query/update the HBase table again (e.g. 3000-4000). There are
> >> > > > > > corner cases that need to be handled.
> >> > > > > >
> >> > > > > >
> >> > > > > > ~Rajesh.B
> >> > > > > >
> >> > > > > >
> >> > > > > > On Sat, May 19, 2012 at 12:04 AM, DIPESH KUMAR SINGH <
> >> > > > > > [EMAIL PROTECTED]> wrote:

Russell Jurney twitter.com/rjurney [EMAIL PROTECTED] datasyndrome.com
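
For the bucket-of-values idea quoted earlier in the thread (each task reserves a range such as 1000-2000 and only goes back to the shared store when it runs out), here is a minimal sketch. It assumes an in-memory AtomicLong as a stand-in for the ZooKeeper or HBase counter, and the class names RangeStore and BlockSequence are hypothetical. A real store would need its own atomic increment, which this sketch does not show.

```java
import java.util.concurrent.atomic.AtomicLong;

// Stand-in for the shared counter (ZooKeeper or an HBase row in the thread).
// Assumption: the real store offers an equivalent atomic "reserve a block" step.
final class RangeStore {
    private final AtomicLong next;

    RangeStore(long start) {
        this.next = new AtomicLong(start);
    }

    /** Atomically reserves blockSize consecutive numbers and returns the first one. */
    long reserve(long blockSize) {
        return next.getAndAdd(blockSize);
    }
}

// Per-task sequence generator: hands out numbers from a locally held block
// and contacts the store only when that block is used up.
final class BlockSequence {
    private final RangeStore store;
    private final long blockSize;
    private long current = 0; // next number to hand out
    private long limit = 0;   // exclusive end of the held block; equal to current means empty

    BlockSequence(RangeStore store, long blockSize) {
        if (blockSize <= 0) {
            throw new IllegalArgumentException("blockSize must be positive");
        }
        this.store = store;
        this.blockSize = blockSize;
    }

    /** Returns the next number, reserving a new block from the store if needed. */
    long next() {
        if (current == limit) {
            current = store.reserve(blockSize);
            limit = current + blockSize;
        }
        return current++;
    }
}
```

With a store starting at 1000 and a block size of 1000, the first task to ask gets 1000-1999, the second gets 2000-2999, and the first task moves on to 3000-3999 once its block is used up. The numbers are unique across tasks but not contiguous: a task that finishes early leaves the rest of its block unused, which is one of the corner cases to decide on.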