Re: What's the equivalent of a GROUP BY statement within a FOREACH statement? - Pig - [mail # user]
...Adam, Take a look at the CountEach UDF in the DataFu library (http://datafu.incubator.apache.org/docs/datafu/1.2.0/datafu/pig/bags/CountEach.html). E.g.: res = foreach raw3 { ...
   Author: Jacob Perkins, 2014-03-20, 14:27
Re: Managing Large Pig Scripts - Pig - [mail # user]
...Christopher, You might consider breaking it into one or more reusable macros. What version of Pig are you using? For complicated scripts, especially if you didn't write them, you might want to...
   Author: Jacob Perkins, 2014-03-05, 15:56
Re: Simple word count in pig.. - Pig - [mail # user]
...Jamal,  You're going to want to use a FLATTEN and another group by. Consider:  flattened   = foreach processed generate id, flatten(tokens) as token; frequency = foreach (grou...
   Author: Jacob Perkins, 2013-11-20, 12:54
Re: Create Table + Join + Max 'String' Date - Pig - [mail # user]
...Abhishek,  The cogroup operator and a filter should get you what you want:  t1_filtered = filter table1 by reporting_dt  wrote:  ...
   Author: Jacob Perkins, 2013-09-10, 11:57
Re: Dedupe Logic - Pig - [mail # user]
...Abhishek,  You should be able to do this by grouping by the three columns and then ordering by the fourth in a nested foreach.  eg:  data = load 'some_url' as (f11, f12, f13, ...
   Author: Jacob Perkins, 2013-08-24, 18:19
Re: multiple file storage with pig - Pig - [mail # user]
...Pablo,  For your first question what you want to do is called a projection of your "grouped" relation. Something like this should work:  grouped = foreach (group cleaned by (timest...
   Author: Jacob Perkins, 2013-07-30, 12:57
Re: Iterating over data set - Pig - [mail # user]
...Xuri,  I don't think you can use functions in the load statement like that. To do something like that you'd need to write your own LoadFunc. As far as I can tell at a glance, and I have...
   Author: Jacob Perkins, 2013-07-30, 12:34
Re: comparing two files using pig - Pig - [mail # user]
...Now here's where it gets fun :)  First, I do want to show you that (given sufficient coffee) there is a set theoretic approach to your first question that allows you to solve it with ju...
   Author: Jacob Perkins, 2013-06-21, 13:38
Re: Spreading data in Pig - Pig - [mail # user]
...Hi John,  The only way I can think of to do this is using the RANK operator (available only in pig version 0.11) along with a custom udf as follows:  * RANK the users relation to r...
   Author: Jacob Perkins, 2013-03-31, 18:13
[PIG-2317] Ruby/Jruby UDFs - Pig - [issue]
...It should be possible to write UDFs in Ruby. These UDFs will be registered in the same way as python and javascript UDFs....
http://issues.apache.org/jira/browse/PIG-2317    Author: Jacob Perkins, 2012-04-03, 18:21