Results 1 to 10 of 10 (0.099s).
GROUP ALL Partitioning - Pig - [mail # user]
...Hi there, Just curious, can anyone provide a quick explanation or link to the source code of how Pig partitions data on a GROUP alias ALL operation?  We're seeing some odd behaviour, likel...
   Author: Mike Sukmanowsky, 2014-01-23, 19:38
Re: Log File Versioning and Pig - Pig - [mail # user]
...Thanks Pradeep - none of our logs currently use Proto Buf/Thrift/Avro and we were somewhat trying to stay away from these guys but they may be a good option.   On Thu, Dec 12, 2013 at 6...
   Author: Mike Sukmanowsky, 2013-12-13, 14:42
Bug in ILLUSTRATE operator - Pig - [mail # user]
...Was going to file in JIRA, but wanted to reach out here first to see if I'm just going crazy.  When using 0.11.2-SNAPSHOT I'm seeing errors only when using ILLUSTRATE (dump and describe...
   Author: Mike Sukmanowsky, 2013-08-29, 14:34
Distinct IDs from different time periods - Pig - [mail # user]
...Hi all,  Trying to produce some data using clickstream logs from Pig that does the following:     1. Pull data for the past 30 days (current period)    2. Classify G...
   Author: Mike Sukmanowsky, 2013-08-13, 20:32
Re: Welcome our newest committer Prashant Kommireddi - Pig - [mail # user]
...Congrats!   On Thu, May 2, 2013 at 3:56 PM, Julien Le Dem  wrote:      Mike Sukmanowsky  Product Lead, http://parse.ly 989 Avenue of the Americas, 3rd Floor New...
   Author: Mike Sukmanowsky, 2013-05-02, 22:41
Re: Pig write to single file - Pig - [mail # user]
...How many output files are you getting?  You can use SET DEFAULT_PARALLEL 1; so you don't have to specify parallelism on each reduce phase.  In general though, I wouldn't recommend ...
   Author: Mike Sukmanowsky, 2013-05-01, 17:17
Re: Don't process already processed files? - Pig - [mail # user]
...It's probably less work to have some kind of a script control Pig execution and keep track of what's been processed and pass in an input path to your Pig script dynamically.  For exampl...
   Author: Mike Sukmanowsky, 2013-03-27, 14:05
Re: Efficient load for data with large number of columns - Pig - [mail # user]
...Yes, as of Pig 0.10.0 you can specify a schema file along with PigStorage when loading or storing data; see http://pig.apache.org/docs/r0.10.0/api/org/apache/pig/builtin/PigStorage.html ...
   Author: Mike Sukmanowsky, 2013-03-27, 14:01
Re: EvalFunc finish() closing connections prematurely - Pig - [mail # user]
...Bump - any thoughts?   On Mon, Mar 18, 2013 at 4:53 PM, Mike Sukmanowsky  wrote:     Mike Sukmanowsky  Product Lead, http://parse.ly 989 Avenue of the Americas, 3rd ...
   Author: Mike Sukmanowsky, 2013-03-22, 15:28
Re: nested order limit by percentage of overall records - Pig - [mail # user]
...Distributed quantiles aren't an easy problem to solve (as you can see from LinkedIn's source) but perhaps in time they'll be brought into core functions.  It wasn't until 0.11.0 that da...
   Author: Mike Sukmanowsky, 2013-03-18, 23:23