Results 71 to 80 of 875 (0.05s).
Re: Using matches in generate clause? - Pig - [mail # user]
...With Pig 0.9 you can do this, though:  FOREACH html_pages GENERATE portal_id, (html matches 'some pattern' ? 1 : 0) as wp_match:int;    On Thu, Sep 27, 2012 at 10:38 AM, Alan ...
   Author: Dmitriy Ryaboy, 2012-09-27, 19:31
Re: How can I split the data with more reducers? - Pig - [mail # user]
...Neat pie chart! What produces this?  Trunk is not entirely stable right now, but it's stabilizing pretty rapidly (as long as you don't go using DateTime types and Cube operations.. don'...
   Author: Dmitriy Ryaboy, 2012-09-17, 09:07
Re: reuse same Tuple and ArrayList for every getNext call in LoadFunc? - Pig - [mail # user]
...Anything that builds a bag -- for example, I was just looking at the DefaultDataBag code (and by extension, DistinctDataBag, etc) and it does not do any tuple copies. We could, of course, ch...
   Author: Dmitriy Ryaboy, 2012-09-17, 05:30
Re: Issues with SAMPLE in PIG v0.8.1 - Pig - [mail # user]
...I just ran this very script three times using Pig 0.8 (svn revision 1148107) on a set of 2.5 million rows and got (2509), (2552), and (2473) as the output.  Don't know what to tell you....
   Author: Dmitriy Ryaboy, 2012-09-17, 05:24
Re: access schema defined in LOAD statement in custom LoadFunc? - Pig - [mail # user]
...I am not sure why pushProjection doesn't solve your dilemma? This is what we use in HBaseStorage, and ElephantBird uses in thrift and protobuf loaders.  D  On Sun, Sep 16, 2012 at ...
   Author: Dmitriy Ryaboy, 2012-09-17, 05:06
Re: Approaches to storing arbitrary schema in a sequencefile - Pig - [mail # user]
...We tend to write protobuf or thrift definitions for complex objects, but that introduces severe latency into the development process. I suppose you could try something like kryo (and create a...
   Author: Dmitriy Ryaboy, 2012-09-16, 02:44
Re: Apache Pig slides from the - Pig - [mail # user]
...Wow, that's a fantastic presentation, Adam! Nice job on all the examples and slides.  D  On Sat, Sep 15, 2012 at 3:16 AM, Adam Kawa wrote:...
   Author: Dmitriy Ryaboy, 2012-09-15, 19:05
Re: Reading BytesWritable in sequence file - Pig - [mail # user]
...Install Protocol Buffers 2.3 and Thrift 0.5.  Protocol Buffer and Thrift compiler dependencies: Elephant Bird requires Protocol Buffer compiler version 2.3 at build time, as generated cl...
   Author: Dmitriy Ryaboy, 2012-09-13, 20:24
Re: Batching transformations in Pig - Pig - [mail # user]
...Group, and pass the grouped sets to your batch-processing UDF?  So:
   data:
   id1 bucket1
   id2 bucket2
   id3 bucket2
   id4 bucket1
   bucketized = group data by bucket_id;
   bucket1...
   Author: Dmitriy Ryaboy, 2012-09-13, 09:48
Re: Modifying databag on the fly - Pig - [mail # dev]
...FYI -- we wound up going with a much cleaner and more memory-friendly solution of returning a new databag implementation which simply proxied all the calls to the original bag, but returned a spe...
   Author: Dmitriy Ryaboy, 2012-09-08, 06:10