Results 1 to 10 of 37 (0.072s).
Re: Is it possible to fix MR jobs order in Pig? - Pig - [mail # user]
...Rodrigo, I see you're using Pig 0.9? The latest code (Pig 0.13) is better about preserving order when building the execution plan. See PIG-3902 (https://issues.apache.org/jira/browse/PIG-3902...
   Author: Jacob Perkins, 2014-07-20, 17:32
Re: Thoughts - Pig - [mail # user]
...Rodrigo, Write your own StoreFunc and use it instead of the UDF. PigStorage can already write to S3, so it should be straightforward to simply subclass that. @thedatachef On Jul 16, 2014, at 2:2...
   Author: Jacob Perkins, 2014-07-16, 14:24
[PIG-3902] PigServer creates cycle - Pig - [issue]
...Under certain conditions PigServer creates a cycle in the logical plan. Consider the following pseudocode: A = load from 'A' using F1; ...process... B = store X into 'B' using F2; C = load from ...
http://issues.apache.org/jira/browse/PIG-3902    Author: Jacob Perkins, 2014-05-10, 23:48
Re: What's the equivalent of a GROUP BY statement within a FOREACH statement? - Pig - [mail # user]
...Adam, Take a look at the CountEach UDF in the DataFu library (http://datafu.incubator.apache.org/docs/datafu/1.2.0/datafu/pig/bags/CountEach.html). E.g.: res = foreach raw3 { ...
   Author: Jacob Perkins, 2014-03-20, 14:27
Re: Managing Large Pig Scripts - Pig - [mail # user]
...Christopher, You might consider breaking it into one or more reusable macros. What version of Pig are you using? For complicated scripts, especially if you didn't write them, you might want to...
   Author: Jacob Perkins, 2014-03-05, 15:56
Re: Simple word count in pig.. - Pig - [mail # user]
...Jamal, You're going to want to use a FLATTEN and another GROUP BY. Consider: flattened = foreach processed generate id, flatten(tokens) as token; frequency = foreach (grou...
   Author: Jacob Perkins, 2013-11-20, 12:54
Re: Create Table + Join + Max 'String' Date - Pig - [mail # user]
...Abhishek,  The cogroup operator and a filter should get you what you want:  t1_filtered = filter table1 by reporting_dt  wrote:  ...
   Author: Jacob Perkins, 2013-09-10, 11:57
Re: Dedupe Logic - Pig - [mail # user]
...Abhishek, You should be able to do this by grouping by the three columns and then ordering by the fourth in a nested foreach. E.g.: data = load 'some_url' as (f11, f12, f13, ...
   Author: Jacob Perkins, 2013-08-24, 18:19
Re: multiple file storage with pig - Pig - [mail # user]
...Pablo, For your first question, what you want to do is called a projection of your "grouped" relation. Something like this should work: grouped = foreach (group cleaned by (timest...
   Author: Jacob Perkins, 2013-07-30, 12:57
Re: Iterating over data set - Pig - [mail # user]
...Xuri, I don't think you can use functions in the load statement like that. To do something like that you'd need to write your own LoadFunc. As far as I can tell at a glance, and I have...
   Author: Jacob Perkins, 2013-07-30, 12:34