Results 71 to 80 of 871 (0.063s).
Re: Issues with SAMPLE in PIG v0.8.1 - Pig - [mail # user]
...I just ran this very script three times using Pig 0.8 (svn revision 1148107) on a set of 2.5 million rows and got (2509), (2552), and (2473) as the output.  Don't know what to tell you....
   Author: Dmitriy Ryaboy, 2012-09-17, 05:24
Re: access schema defined in LOAD statement in custom LoadFunc? - Pig - [mail # user]
...I am not sure why pushProjection doesn't solve your dilemma? This is what we use in HBaseStorage, and ElephantBird uses in thrift and protobuf loaders.  D  On Sun, Sep 16, 2012 at ...
   Author: Dmitriy Ryaboy, 2012-09-17, 05:06
Re: Approaches to storing arbitrary schema in a sequencefile - Pig - [mail # user]
...We tend to write protobuf or thrift definition for complex objects, but that introduces severe latency into the development process. I suppose you could try something like kryo (and create a...
   Author: Dmitriy Ryaboy, 2012-09-16, 02:44
Re: Apache Pig slides from the - Pig - [mail # user]
...Wow, that's a fantastic presentation Adam! Nice job on all the examples and slides.  D  On Sat, Sep 15, 2012 at 3:16 AM, Adam Kawa  wrote:...
   Author: Dmitriy Ryaboy, 2012-09-15, 19:05
Re: Reading BytesWritable in sequence file - Pig - [mail # user]
...Install protocol buffers 2.3 and thrift 0.5   Protocol Buffer and Thrift compiler dependencies Elephant Bird requires Protocol Buffer compiler version 2.3 at build time, as generated cl...
   Author: Dmitriy Ryaboy, 2012-09-13, 20:24
Re: Batching transformations in Pig - Pig - [mail # user]
...Group, and pass the grouped sets to your batch-processing UDF?  so:  data: id1 bucket1 id2 bucket2 id3 bucket2 id4 bucket1  bucketized = group data by bucket_id;  bucket1...
   Author: Dmitriy Ryaboy, 2012-09-13, 09:48
Re: Modifying databag on the fly - Pig - [mail # dev]
...FYI -- we wound up going with a much cleaner and memory-friendly solution of returning a new databag implementation which simply proxied all the calls to the original bag, but returned a spe...
   Author: Dmitriy Ryaboy, 2012-09-08, 06:10
Re: Using LoadFunc to get arbitrary data into Pig script - Pig - [mail # user]
...Hi Thomas, This isn't a complete answer, but take a look at mock.Storage that Julien wrote to make testing easy:  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/builtin/mo...
   Author: Dmitriy Ryaboy, 2012-09-07, 16:56
Re: Machine Learning + Pig? - Pig - [mail # user]
...Please take a look at Alek and Jimmy's paper on ML in Pig; there are also a few presentations they did on this, here's one from the Hadoop Summit: https://speakerdeck.com/u/lintool/p/large-s...
   Author: Dmitriy Ryaboy, 2012-09-06, 23:01
Re: Current "patch available' and open issues - Pig - [mail # dev]
...+1  (we also need to train committers to actually review stuff.. guilty of not reviewing, myself..)  D  On Tue, Sep 4, 2012 at 10:59 AM, Alan Gates  wrote:...
   Author: Dmitriy Ryaboy, 2012-09-04, 18:42