Pig, mail # user - Create rdbms like sequence in Pig on Pig Relation


Re: Create rdbms like sequence in Pig on Pig Relation
Russell Jurney 2012-05-24, 05:12
How do you invoke java.util.UUID.randomUUID? Is there no invoker that
works without an argument?
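
One way to follow the custom-UDF suggestion quoted below, sketched in plain Java so it runs without Pig on the classpath. `UniqueKey` and `next()` are made-up names for illustration. In an actual UDF the same call would presumably live in the `exec(Tuple)` method of a class extending `org.apache.pig.EvalFunc<String>`, which is not compiled here:

```java
import java.util.UUID;

// Hypothetical helper; the class and method names are illustrative, not part of Pig.
class UniqueKey {
    // Returns a random (version 4) UUID in its 36-character string form.
    // No coordination between tasks is needed, so mappers can call this in parallel.
    static String next() {
        return UUID.randomUUID().toString();
    }

    public static void main(String[] args) {
        // Assumption: in a Pig UDF, the body of next() would sit inside
        // exec(Tuple input) of a class extending org.apache.pig.EvalFunc<String>.
        System.out.println(next());
    }
}
```

This gives a unique key per record, not an ordered counter, so it only fits the case where the sequence number itself does not matter.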

On Sun, May 20, 2012 at 6:26 PM, Rajesh Balamohan <
[EMAIL PROTECTED]> wrote:

> I don't think so. However, it's a single-line Java call. You can create a
> custom UDF for this and use it in your code.
>
> java.util.UUID.randomUUID();
>
> ~Rajesh.B
>
> On Sun, May 20, 2012 at 8:15 AM, DIPESH KUMAR SINGH
> <[EMAIL PROTECTED]> wrote:
>
> > Thanks Rajesh.
> >
> > Is GUID a built-in UDF?
> >
> >
> > --
> > Dipesh
> >
> > On Sun, May 20, 2012 at 8:06 AM, Rajesh Balamohan <
> > [EMAIL PROTECTED]> wrote:
> >
> > > If you do not care about the sequence number and the intention is just
> > > to create a unique key, you can use a GUID, which doesn't require any
> > > synchronization at all (all mappers can run in parallel).
> > >
> > > The approach I suggested in my earlier mail matters mainly when you
> > > need a sequence number.
> > >
> > > ~Rajesh.B
> > >
> > > On Sun, May 20, 2012 at 8:02 AM, Rajesh Balamohan <
> > > [EMAIL PROTECTED]> wrote:
> > >
> > > > Pig doesn't have that facility yet. Moreover, it's not very efficient
> > > > to do this in Pig/MR, as it requires synchronization.
> > > >
> > > > However, if this is unavoidable for you, the following options can be
> > > > considered:
> > > >
> > > > 1. Maintain the sequence number details in ZooKeeper.
> > > > 2. Keep a simple structure in an HBase table (seqNumber --> Value).
> > > > You can get a bucket of values (ex: 1000-2000) from this and use it in
> > > > your UDF. When the range depletes, you have to query/update the HBase
> > > > table again (ex: 3000-4000). There are corner cases which need to be
> > > > handled.
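
A rough sketch of the bucket idea in option 2 above, in plain Java. The shared store is simulated with an `AtomicLong` rather than a real HBase table or ZooKeeper node, and `SharedCounter`, `BucketedSequence`, and the bucket size are hypothetical names and values chosen for illustration:

```java
import java.util.concurrent.atomic.AtomicLong;

// Stand-in for the shared store (an HBase row or ZooKeeper node in the thread).
// Assumption: an in-memory AtomicLong simulates the atomic "reserve a range" call.
class SharedCounter {
    private final AtomicLong next;

    SharedCounter(long start) {
        this.next = new AtomicLong(start);
    }

    // Atomically reserves `size` consecutive values and returns the first one.
    long reserve(long size) {
        return next.getAndAdd(size);
    }
}

// One instance per task; hands out values from a locally cached bucket
// and goes back to the shared counter only when the bucket is used up.
class BucketedSequence {
    private final SharedCounter shared;
    private final long bucketSize;
    private long current; // next value to hand out
    private long limit;   // exclusive end of the current bucket

    BucketedSequence(SharedCounter shared, long bucketSize) {
        this.shared = shared;
        this.bucketSize = bucketSize;
        // current == limit == 0 here, which forces a reservation on first use.
    }

    long next() {
        if (current >= limit) {
            current = shared.reserve(bucketSize);
            limit = current + bucketSize;
        }
        return current++;
    }
}
```

With a shared counter starting at 201 and a bucket size of 1000, one task hands out 201, 202, and so on, while a second task starts at 1201. The values are unique but not contiguous across tasks, and any unused part of a bucket is lost when a task ends, which is presumably among the corner cases mentioned.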
> > > >
> > > >
> > > > ~Rajesh.B
> > > >
> > > >
> > > > On Sat, May 19, 2012 at 12:04 AM, DIPESH KUMAR SINGH <
> > > > [EMAIL PROTECTED]> wrote:
> > > >
> > > >> Sorry if my point was not clear.
> > > >>
> > > >> I wish to create a sequence on a Pig relation.
> > > >>
> > > >> Say, for example, I have a relation with data:
> > > >> (John, A-1)
> > > >> (Jack, B-2)
> > > >> (Jim, C-1)
> > > >>
> > > >> I want to create a sequence, i.e. add one more column to the relation,
> > > >> like a counter, and keep increasing the count for each record read.
> > > >> The expected output should be something like this:
> > > >>
> > > >> (If 200 is the start of the sequence.)
> > > >> (John, A-1, 201)
> > > >> (Jack, B-2, 202)
> > > >> (Jim, C-1, 203)
> > > >>
> > > >> Could you please suggest how to proceed on this?
> > > >>
> > > >> Thanks,
> > > >> Dipesh
> > > >>
> > > >> On Fri, May 18, 2012 at 6:50 AM, Thejas Nair <[EMAIL PROTECTED]>
> > > >> wrote:
> > > >>
> > > >> > What do you mean by 'RDBMS-like sequence'?
> > > >> > Thanks,
> > > >> > Thejas
> > > >> >
> > > >> >
> > > >> > On 5/16/12 10:41 AM, DIPESH KUMAR SINGH wrote:
> > > >> >
> > > >> >> I want to create an RDBMS-like sequence on a Pig relation.
> > > >> >>
> > > >> >> Is there any existing UDF which could do this?
> > > >> >>
> > > >> >> I am a bit new to Pig. Kindly suggest how to proceed.
> > > >> >>
> > > >> >>
> > > >> >> Thanks & Regards,
> > > >> >>
> > > >> >
> > > >> >
> > > >>
> > > >>
> > > >> --
> > > >> Dipesh Kr. Singh
> > > >>
> > > >
> > > >
> > > >
> > > > --
> > > > ~Rajesh.B
> > > >
> > >
> > >
> > >
> > > --
> > > ~Rajesh.B
> > >
> >
> >
> >
> > --
> > Dipesh Kr. Singh
> >
>
>
>
> --
> ~Rajesh.B
>

--
Russell Jurney twitter.com/rjurney [EMAIL PROTECTED] datasyndrome.com