Pig >> mail # user >> Create rdbms like sequence in Pig on Pig Relation


Re: Create rdbms like sequence in Pig on Pig Relation
It helps, but I am not able to invoke java.util.UUID.toString, maybe
because it doesn't take an argument.  This is from the docs:

DEFINE UrlDecode InvokeForString('java.net.URLDecoder.decode', 'String String');
encoded_strings = LOAD 'encoded_strings.txt' as (encoded:chararray);
decoded_strings = FOREACH encoded_strings GENERATE UrlDecode(encoded, 'UTF-8');

Maybe I forgot, but is this how I do it?

DEFINE UUID InvokeForString('java.util.UUID.toString');
with_uuid = FOREACH my_stuff generate UUID(), *;
Sorry, I only understand example code - not APIs. My Java is quite weak.

http://docs.oracle.com/javase/6/docs/api/java/util/UUID.html#toString()
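
A minimal Java sketch of why the definition above is awkward, and one possible way around it. Per the JDK docs linked here, java.util.UUID.randomUUID() is a static method that returns a UUID object, while toString() is an instance method on that object, so the two have to be chained. The class name UuidHelper and both next methods are hypothetical names made up for illustration; they are not part of Pig or the JDK. Whether a given Pig release's dynamic invoker accepts the no-argument form is an assumption, which is why an overload with a throwaway String parameter is included.

```java
import java.util.UUID;

// Hypothetical helper (illustration only). To call it from Pig it would
// need to be declared public, live in its own file, and be on the classpath.
final class UuidHelper {
    private UuidHelper() {}

    // randomUUID() is static; toString() is called on the UUID it returns.
    public static String next() {
        return UUID.randomUUID().toString();
    }

    // Same result with an ignored String argument, in case the invoker
    // wants a parameter signature such as 'String' (assumption).
    public static String next(String ignored) {
        return next();
    }

    public static void main(String[] args) {
        System.out.println(next());
        System.out.println(next("any-field-value"));
    }
}
```

Following the UrlDecode pattern from the docs, a definition along the lines of DEFINE UUID InvokeForString('UuidHelper.next', 'String'); might then work once the class is registered, but that is untested here.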

On Sun, May 27, 2012 at 2:33 AM, Subir S <[EMAIL PROTECTED]> wrote:

> I hope this helps: the dynamic invoker feature in Pig, added in 0.8.0.
>
>
> http://squarecog.wordpress.com/2010/08/20/upcoming-features-in-pig-0-8-dynamic-invokers/
>
> Thanks
>
> On 5/24/12, Russell Jurney <[EMAIL PROTECTED]> wrote:
> > Thanks, I mean how do you invoke it directly in grunt> from Pig?
> >
> > I have been messing it up for the last 30 minutes. Should I check the settings
> > on my pacemaker? I feel like Fabio on NyQuil messing with this.
> >
> > On Wed, May 23, 2012 at 10:19 PM, Subir S <[EMAIL PROTECTED]>
> > wrote:
> >
> >> Hope this helps ->
> >> http://www.javapractices.com/topic/TopicAction.do?Id=56
> >>
> >> and this ->
> >> http://docs.oracle.com/javase/1.5.0/docs/api/java/util/UUID.html#randomUUID%28%29
> >>
> >> Thanks
> >>
> >>
> >>
> >> On Thu, May 24, 2012 at 10:42 AM, Russell Jurney
> >> <[EMAIL PROTECTED]>wrote:
> >>
> >> > How do you invoke java.util.UUID.randomUUID?  There is no invoker that
> >> > doesn't take an arg?
> >> >
> >> > On Sun, May 20, 2012 at 6:26 PM, Rajesh Balamohan <
> >> > [EMAIL PROTECTED]> wrote:
> >> >
> >> > > I don't think so. However, it's a single-line Java command. You can create
> >> > > a custom UDF for this and use it in your code.
> >> > >
> >> > > java.util.UUID.randomUUID();
> >> > >
> >> > > ~Rajesh.B
> >> > >
> >> > > On Sun, May 20, 2012 at 8:15 AM, DIPESH KUMAR SINGH
> >> > > <[EMAIL PROTECTED]>wrote:
> >> > >
> >> > > > Thanks Rajesh.
> >> > > >
> >> > > > Is GUID a built in UDF?
> >> > > >
> >> > > >
> >> > > > --
> >> > > > Dipesh
> >> > > >
> >> > > > On Sun, May 20, 2012 at 8:06 AM, Rajesh Balamohan <
> >> > > > [EMAIL PROTECTED]> wrote:
> >> > > >
> >> > > > > If you do not care about sequence numbers and the intention is just to
> >> > > > > create a unique key, you can use a GUID, which doesn't require any
> >> > > > > synchronization at all (all mappers can run in parallel).
> >> > > > >
> >> > > > > The approach I suggested in my earlier mail comes into the picture
> >> > > > > mainly for sequence numbers.
> >> > > > >
> >> > > > > ~Rajesh.B
> >> > > > >
> >> > > > > On Sun, May 20, 2012 at 8:02 AM, Rajesh Balamohan <
> >> > > > > [EMAIL PROTECTED]> wrote:
> >> > > > >
> >> > > > > > Pig doesn't have that facility yet. Moreover, it's not very efficient to
> >> > > > > > do this in Pig/MR, as it requires synchronization.
> >> > > > > >
> >> > > > > > However, if this is unavoidable in your situation, the following can be
> >> > > > > > considered:
> >> > > > > >
> >> > > > > > 1. Maintaining the sequence number details in ZooKeeper.
> >> > > > > > 2. Having a simple structure in an HBase table (seqNumber --> Value). You
> >> > > > > > can get a bucket of values (ex: 1000-2000) from this and use it in your
> >> > > > > > UDF. When the range is depleted, you have to query/update the HBase table
> >> > > > > > again (ex: 3000-4000). There are corner cases which need to be handled.
> >> > > > > >
> >> > > > > >
> >> > > > > > ~Rajesh.B
> >> > > > > >
> >> > > > > >
> >> > > > > > On Sat, May 19, 2012 at 12:04 AM, DIPESH KUMAR SINGH <
> >> > > > > > [EMAIL PROTECTED]> wrote:
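
The bucket approach from Rajesh's earliest reply above (reserve a range of sequence numbers from a shared store, hand them out locally, reserve again when the range runs out) can be sketched roughly as follows. This is only an illustration under stated assumptions: BlockSequence and the reserveBlock callback are hypothetical names, and an in-memory AtomicLong stands in for the HBase table or ZooKeeper node, whose client calls are not shown.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongUnaryOperator;

// Hypothetical sketch: hands out ids from a locally held block and asks a
// shared store for a new block only when the current one is used up.
final class BlockSequence {
    // Given a block size, reserves that many ids in the shared store and
    // returns the first id of the reserved block.
    private final LongUnaryOperator reserveBlock;
    private final long blockSize;
    private long next;   // next id to hand out
    private long limit;  // exclusive end of the current block

    BlockSequence(LongUnaryOperator reserveBlock, long blockSize) {
        if (blockSize <= 0) {
            throw new IllegalArgumentException("blockSize must be positive");
        }
        this.reserveBlock = reserveBlock;
        this.blockSize = blockSize;
        // next == limit means "no block held yet", forcing a reservation.
        this.next = 0;
        this.limit = 0;
    }

    synchronized long nextId() {
        if (next >= limit) {
            next = reserveBlock.applyAsLong(blockSize);
            limit = next + blockSize;
        }
        return next++;
    }

    public static void main(String[] args) {
        // In-memory stand-in for the shared counter (assumption).
        AtomicLong sharedCounter = new AtomicLong(1000);
        LongUnaryOperator reserve = size -> sharedCounter.getAndAdd(size);

        BlockSequence taskA = new BlockSequence(reserve, 1000);
        BlockSequence taskB = new BlockSequence(reserve, 1000);
        System.out.println(taskA.nextId()); // 1000
        System.out.println(taskB.nextId()); // 2000
        System.out.println(taskA.nextId()); // 1001
    }
}
```

Ids produced this way are unique across tasks but neither gap-free nor globally ordered: a task that finishes early leaves the rest of its block unused. That is one of the corner cases the reply alludes to.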

Russell Jurney twitter.com/rjurney [EMAIL PROTECTED] datasyndrome.com