Results 91 to 100 of 220 (0.08s).
[PIG-3845] Use shared edge with no multiquery - Pig - [issue]
...Currently we write map output once for every split and consume in downstream vertices. Need a custom Shared edge implementation in Tez where we can write once and read from all successor ver...
http://issues.apache.org/jira/browse/PIG-3845    Author: Rohini Palaniswamy, 2014-05-30, 20:17
[PIG-3847] Sort avoidance for group by and join - Pig - [issue]
...Group by and join only require that the records be grouped together by key. It is not necessary for the keys to be sorted. If we can have a Tez Input/Output implementation that does the grou...
http://issues.apache.org/jira/browse/PIG-3847    Author: Rohini Palaniswamy, 2014-05-30, 20:16
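As a rough illustration of the point in PIG-3847 (grouping needs records with equal keys brought together, not keys in sorted order), here is a minimal Python sketch that groups with a hash table and never sorts. It is not Pig or Tez code; the function name `group_without_sort` and the sample records are hypothetical.

```python
from collections import defaultdict


def group_without_sort(records):
    """Group (key, value) pairs by key using a hash table; keys are never sorted."""
    groups = defaultdict(list)
    for key, value in records:
        groups[key].append(value)
    return dict(groups)


# Hypothetical input: keys arrive in arbitrary order.
records = [("b", 1), ("a", 2), ("b", 3), ("c", 4), ("a", 5)]
grouped = group_without_sort(records)
```

The groups come out in first-seen key order rather than sorted order, which is all a group by or join needs.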
[PIG-3848] Dynamically switch to replicate join - Pig - [issue]
...If data sizes are found to be small after filtering, then switch to replicate join dynamically if the user has not specified a "using" clause explicitly. But this required support from...
http://issues.apache.org/jira/browse/PIG-3848    Author: Rohini Palaniswamy, 2014-05-30, 20:16
[PIG-3849] Optimize group by followed by join on the same key - Pig - [issue]
...This can be done in one vertex with multiple inputs instead of having an extra vertex to do the join, i.e., currently Vertex 1 (load relation1) -> Vertex 2 (group by) -> Vertex 4 (j...
http://issues.apache.org/jira/browse/PIG-3849    Author: Rohini Palaniswamy, 2014-05-30, 20:16
[PIG-3850] Optimize join followed by order by using same key - Pig - [issue]
...Possible optimizations: 1) If it is a skewed join, then we can combine ordering into it instead of doing an additional order by, as skewed join already involves sampling...
http://issues.apache.org/jira/browse/PIG-3850    Author: Rohini Palaniswamy, 2014-05-30, 20:16
[PIG-3852] Remove SecurityHelper class for Tez and use Tez helpers instead - Pig - [issue]
...Tez JIRAs have been created to support mapreduce.job.hdfs-servers and mapreduce.job.credentials.binary. So we can get rid of SecurityHelper.java written for Pig on Tez....
http://issues.apache.org/jira/browse/PIG-3852    Author: Rohini Palaniswamy, 2014-05-30, 20:15
[PIG-3856] UnionOptimizer in Tez should optimize the case of replicated join - Pig - [issue]
...Replicate join input that was broadcast to the union vertex now needs to be broadcast to all the union predecessors. So we need to create edges from the replicate join input to all the union pre...
http://issues.apache.org/jira/browse/PIG-3856    Author: Rohini Palaniswamy, 2014-05-30, 20:15
[PIG-3891] FileBasedOutputSizeReader does not calculate size of files in sub-directories - Pig - [issue]
...FileBasedOutputSizeReader only includes files in the top-level output directory. So if files are stored under subdirectories (e.g., MultiStorage), it does not have the bytes written correc...
http://issues.apache.org/jira/browse/PIG-3891    Author: Rohini Palaniswamy, 2014-05-30, 01:53
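The difference described in PIG-3891 can be sketched on a local filesystem, as an assumption-laden stand-in for HDFS: a top-level scan misses bytes written under subdirectories, while a recursive walk counts them. The function names and the directory layout below are hypothetical and are not Pig's implementation.

```python
import os
import tempfile


def top_level_size(path):
    """Sum sizes of regular files directly under path (subdirectories ignored)."""
    return sum(e.stat().st_size for e in os.scandir(path) if e.is_file())


def recursive_size(path):
    """Sum sizes of all regular files under path, including subdirectories."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total


# Hypothetical output layout: one top-level file plus one file in a subdirectory.
with tempfile.TemporaryDirectory() as out_dir:
    with open(os.path.join(out_dir, "part-0"), "wb") as f:
        f.write(b"x" * 10)
    os.mkdir(os.path.join(out_dir, "key=a"))
    with open(os.path.join(out_dir, "key=a", "part-0"), "wb") as f:
        f.write(b"y" * 5)
    shallow = top_level_size(out_dir)  # counts only the top-level file
    deep = recursive_size(out_dir)     # also counts the file in key=a/
```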
Re: Sampling in operations like Order by - Pig - [mail # dev]
...If there is just one reducer there is no need for sampling (PIG-2784), but when there is more than one reducer in order by you need to sample the data and determine the partition ranges so tha...
   Author: Rohini Palaniswamy, 2014-05-22, 18:33
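The sampling step mentioned in this message can be illustrated with a small Python sketch of sample-based range partitioning. This is a generic version of the technique under stated assumptions, not Pig's actual sampler: the function names, the sample size of 100, and the choice of evenly spaced cut points are all hypothetical.

```python
import bisect
import random


def partition_boundaries(sample, num_reducers):
    """Pick num_reducers - 1 cut points from a sample of the keys."""
    if num_reducers <= 1:
        return []  # a single reducer needs no sampling or ranges
    ordered = sorted(sample)
    step = len(ordered) / num_reducers
    return [ordered[int(step * i)] for i in range(1, num_reducers)]


def assign_reducer(key, boundaries):
    """Return the index of the range (reducer) that the key falls into."""
    return bisect.bisect_right(boundaries, key)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
data = [rng.randrange(1000) for _ in range(10_000)]
sample = rng.sample(data, 100)
boundaries = partition_boundaries(sample, 4)

partitions = [[] for _ in range(4)]
for key in data:
    partitions[assign_reducer(key, boundaries)].append(key)

# Each "reducer" sorts its own range; concatenating the ranges gives a total order.
merged = [k for part in partitions for k in sorted(part)]
```

Because every key in one range is no larger than any key in the next, the reducers can sort independently and their outputs only need to be concatenated.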
[PIG-2672] Optimize the use of DistributedCache - Pig - [issue]
...Pig currently copies jar files to a temporary location in hdfs and then adds them to DistributedCache for each job launched. This is inefficient in terms of Space - The jars are distributed...
http://issues.apache.org/jira/browse/PIG-2672    Author: Rohini Palaniswamy, 2014-05-20, 19:14
project
Pig (220)
Tez (40)
Hive (6)
Bigtop (1)
HBase (1)
HDFS (1)
type
issue (153)
mail # dev (43)
mail # user (24)
date
last 7 days (9)
last 30 days (23)
last 90 days (36)
last 6 months (98)
last 9 months (220)
author
Daniel Dai (405)
Dmitriy Ryaboy (345)
Alan Gates (334)
Cheolsoo Park (271)
Jonathan Coveney (230)
Rohini Palaniswamy (174)
Russell Jurney (173)
Olga Natkovich (131)
Bill Graham (130)
Prashant Kommireddi (110)
Julien Le Dem (81)
Aniket Mokashi (79)
Thejas Nair (70)
Thejas M Nair (63)
Mridul Muralidharan (61)
Ashutosh Chauhan (42)
pi song (41)
Gianmarco De Francisci Mo...(39)
Koji Noguchi (38)
liyunzhang_intel (37)
Pradeep Gollakota (36)
Cheolsoo Park (35)
Ruslan Al-Fakikh (35)
Dmitriy V. Ryaboy (34)
Jeff Zhang (32)