Results 91 to 100 of 1352.
Re: access schema defined in LOAD statement in custom LoadFunc? - Pig - [mail # user]
...I am not sure why pushProjection doesn't solve your dilemma? This is what we use in HBaseStorage, and ElephantBird uses in thrift and protobuf loaders.  D  On Sun, Sep 16, 2012 at ...
   Author: Dmitriy Ryaboy, 2012-09-17, 05:06
Re: Issues with SAMPLE in PIG v0.8.1 - Pig - [mail # user]
...Brian, could you provide a complete script that reproduces the issue? What version of pig are you on?  Thanks, -D  On Sun, Sep 16, 2012 at 8:15 PM, Brian Choi  wrote:...
   Author: Dmitriy Ryaboy, 2012-09-17, 05:02
Re: How can I split the data with more reducers? - Pig - [mail # user]
...Ok, then it's not POSplit that's holding the memory -- it does not participate in any of the reduce stages, according to the plan you attached.  To set parallelism, you can hardcode it...
   Author: Dmitriy Ryaboy, 2012-09-17, 05:01
Re: reuse same Tuple and ArrayList for every getNext call in LoadFunc? - Pig - [mail # user]
...I looked into this a while back -- trouble comes when something downstream from the loader tries to collect inputs into a bag, and doesn't do its own copies. One can easily argue that if som...
   Author: Dmitriy Ryaboy, 2012-09-17, 04:44
Re: How can I split the data with more reducers? - Pig - [mail # user]
...Still would like to see the script or the explain plan..  D  On Sat, Sep 15, 2012 at 7:50 PM, Haitao Yao  wrote: appers succeeded and the reducer failed. ined too much memory ...
   Author: Dmitriy Ryaboy, 2012-09-16, 08:41
Re: Approaches to storing arbitrary schema in a sequencefile - Pig - [mail # user]
...We tend to write protobuf or thrift definition for complex objects, but that introduces severe latency into the development process. I suppose you could try something like kryo (and create a...
   Author: Dmitriy Ryaboy, 2012-09-16, 02:44
Re: How can I split the data with more reducers? - Pig - [mail # user]
...That looks like a mapper, not a reducer. What's the script doing?  Dmitriy  On Sat, Sep 15, 2012 at 7:08 PM, Haitao Yao  wrote:  ...
   Author: Dmitriy Ryaboy, 2012-09-16, 02:39
Re: Apache Pig slides from the - Pig - [mail # user]
...Wow, that's a fantastic presentation Adam! Nice job on all the examples and slides.  D  On Sat, Sep 15, 2012 at 3:16 AM, Adam Kawa  wrote:...
   Author: Dmitriy Ryaboy, 2012-09-15, 19:05
Re: Reading BytesWritable in sequence file - Pig - [mail # user]
...Install protocol buffers 2.3 and thrift 0.5   Protocol Buffer and Thrift compiler dependencies Elephant Bird requires Protocol Buffer compiler version 2.3 at build time, as generated cl...
   Author: Dmitriy Ryaboy, 2012-09-13, 20:24
Re: Batching transformations in Pig - Pig - [mail # user]
...Group, and pass the grouped sets to your batch-processing UDF?  so:  data: id1 bucket1 id2 bucket2 id3 bucket2 id4 bucket1  bucketized = group data by bucket_id;  bucket1...
   Author: Dmitriy Ryaboy, 2012-09-13, 09:48