Results 11 to 20 of 75 (0.067s).
Re: Pig script : Need help - Pig - [mail # user]
...That is because you're calling REPLACE on a bag of tuples and not a string. What you would want to do is write a UDF (suggested name JOIN_ON) that takes as an argument a join char and will jo...
   Author: Pradeep Gollakota, 2014-04-07, 20:28
Re: Reply: Re: Any way to join two aliases without using CROSS - Pig - [mail # user]
...Unfortunately, the Enumerate UDF from DataFu would not work in this case. The UDF works on Bags and in this case, we want to enumerate a relation. Implementing RANK is a very tricky thing to d...
   Author: Pradeep Gollakota, 2014-03-26, 04:38
Re: Unable to add file paths when registering a UDF - Pig - [mail # user]
...According to the docs, it should work. http://pig.apache.org/docs/r0.12.0/basic.html#register Stupid question, but is the path correct? Is it on HDFS or local disk? On Tue, Mar 11, 2014 at 8:36...
   Author: Pradeep Gollakota, 2014-03-13, 03:43
Re: one MR job for group-bys and cube-bys - Pig - [mail # user]
...I forgot to mention that there are also other 3rd party libraries that make examining the physical plan easier. For example take a look at Lipstick from Netflix. On Tue, Mar 11, 2014 at 11:41 AM...
   Author: Pradeep Gollakota, 2014-03-11, 18:49
Re: Nested foreach with order by - Pig - [mail # user]
...No... that wouldn't be related since you're not doing a GROUP ALL. The `FLATTEN(MY_UDF(t))` has me a little wary. Something is possibly going wrong in your UDF. The output of your UDF is goin...
   Author: Pradeep Gollakota, 2014-02-28, 00:13
Re: how to control nested CROSS parallelism? - Pig - [mail # user]
...It's strange that it's being executed on the Map-side. The group is a reduce side operation (I'm assuming) and it seems that the nested foreach would happen on Reduce-side after grouping. Ha...
   Author: Pradeep Gollakota, 2014-01-20, 18:27
Re: Spilling issue - Optimize "GROUP BY" - Pig - [mail # user]
...Did you mean to say "timeout" instead of "spill"? Spills don't cause task failures (unless a spill fails). Default timeout for a task is 10 min. It would be very helpful to have a stack trac...
   Author: Pradeep Gollakota, 2014-01-10, 18:23
Re: Log File Versioning and Pig - Pig - [mail # user]
...It seems like what you're asking for is Versioned Schema management. Pig is not designed for that. Pig is only a scripting language to manipulate datasets. I'd recommend you look into ...
   Author: Pradeep Gollakota, 2013-12-12, 23:35
Re: CROSS/Self-Join Bug - Please Help :( - Pig - [mail # user]
...I tried the following script (not exactly the same) and it worked correctly for me. businesses = LOAD 'dataset' using PigStorage(',') AS (a, b, c, business_id: chararray, lat: double, l...
   Author: Pradeep Gollakota, 2013-12-04, 21:41
Re: Trouble with REGEX in PIG - Pig - [mail # user]
...It's not valid PigLatin... The Grunt shell doesn't let you try out functions and UDFs as you're trying to use them. A = LOAD 'data' USING PigStorage() as (ip: char...
   Author: Pradeep Gollakota, 2013-12-04, 18:28