Pig >> mail # user >> Synthetic keys


Re: Synthetic keys
One reason you might prefer the JOIN a BY 1, b BY 1 syntax over CROSS is
that it lets you specify a join type, such as the replicated join, which
can improve performance when the smaller relation fits in memory.
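
For reference, here is a minimal sketch of the two forms side by side. The
file paths and field names are made up for illustration, and I am assuming
the default loader; adjust to your data. Note that with 'replicated' the
relations listed after the first one are loaded into memory, so the small
relation should come last.

```pig
-- Hypothetical inputs: 'big.txt' and 'small.txt' are placeholder paths,
-- and the schemas are invented for this example.
a = LOAD 'big.txt' AS (id:int, name:chararray);
b = LOAD 'small.txt' AS (code:chararray);

-- Constant-key join: every row of a carries key 1 and every row of b
-- carries key 1, so each row of a pairs with each row of b.
-- 'replicated' asks for a map-side join; b (the last relation) is held
-- in memory, so it has to be small enough to fit.
x = JOIN a BY 1, b BY 1 USING 'replicated';

-- The same pairing with the built-in CROSS, which takes no join type.
y = CROSS a, b;

DUMP x;
```

If b is too large for memory the replicated join should fail rather than
silently fall back, so it is worth checking sizes before relying on it.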
On Fri, May 24, 2013 at 9:15 AM, Jonathan Coveney <[EMAIL PROTECTED]> wrote:

> You can do this, but Pig has a CROSS keyword that you can use.
>
>
> 2013/5/23 Mehmet Tepedelenlioglu <[EMAIL PROTECTED]>
>
> > Hi,
> >
> > I am using this:
> >
> > x = join a by 1, b by 1  using 'replicated';
> >
> > with the hope that it generates some synthetic key '1' on both a and b
> > and joins it on that key, thereby, in this case, doing a clean map-side
> > cross of a and b with no schema changes (exactly the way a cross would
> > work). It seems to be working, but since I just tried it and it worked,
> > I am not sure if there is anything in there I should be aware of. Does
> > anyone know?
> >
> > Thanks,
> >
> > Mehmet
> >
> >
> >
>