Results 31 to 40 of 64.
elephantbird JsonLoader doesn't like gz? - Pig - [mail # user]
...Hi,  Anyone using Twitter's elephantbird library? I was using its JsonLoader and got this error:  WARN  com.twitter.elephantbird.pig.load.JsonLoader - Could not json-decode st...
   Author: Dexin Wang, 2011-05-18, 18:12
Re: Can I pass an entire relation to a Pig UDF? - Pig - [mail # user]
...If the whole set is not that big, sorting in shell might be the easiest.  I've done that with result set of millions of records.    On Apr 26, 2011, at 8:49 PM, Arun A K ...
   Author: Dexin Wang, 2011-04-27, 04:14
Re: implementing "if" logic - Pig - [mail # user]
...Here's a trick I used:  Together with $x, pass in another parameter $comment that's either '' (blank) when x>0 or '--' (double dashes) when x==0. Then  result = SOME OPERATION $...
   Author: Dexin Wang, 2011-03-27, 21:10
Re: reducer throttling? - Pig - [mail # user]
...Thanks for your explanation Alex.  In some cases, there isn't even a reduce phase. For example, we have some raw data, after our custom LOAD function and some filter function, it direct...
   Author: Dexin Wang, 2011-03-25, 01:18
Re: possibly Pig throttles the number of mappers - Pig - [mail # user]
...Thanks Alan!  We are using 0.79. Also got an answer from #hadoop channel and with this quora answer:  http://www.quora.com/Where-does-Hadoop-latency-come-from-e-g-it-takes-15-25-se...
   Author: Dexin Wang, 2011-03-24, 00:58
Re: possibly Pig throttles the number of mappers - Pig - [mail # user]
...And the nodes are pretty lightly loaded (~1.0) and there's plenty of free memory. Now I'm seeing 2 mappers per node. Very much under-utilized.  On Wed, Mar 23, 2011 at 1:39 PM, Dexin Wa...
   Author: Dexin Wang, 2011-03-24, 00:45
possibly Pig throttles the number of mappers - Pig - [mail # user]
...Hi,  We've seen a strange problem where some Pig jobs would just run fewer mappers concurrently than the mapper capacity. Specifically we have a 10 node cluster and each is configured t...
   Author: Dexin Wang, 2011-03-23, 20:39
Re: reducer throttling? - Pig - [mail # user]
...Can you describe a bit more about your bulk insert technique? And the way you control the number of reducers is also by adding artificial ORDER or GROUP step?  Thanks!  On Thu, Mar...
   Author: Dexin Wang, 2011-03-17, 21:00
reducer throttling? - Pig - [mail # user]
...We do some processing in hadoop then as the last step, we write the result to database. Database is not good at handling hundreds of concurrent connections and fast writes. So we need to thr...
   Author: Dexin Wang, 2011-03-17, 18:03
Re: STORE with variable? - Pig - [mail # user]
...Unfortunately, it doesn't work.  Seems the same problem as in https://issues.apache.org/jira/browse/PIG-1547  On Tue, Mar 8, 2011 at 1:22 PM, Dexin Wang  wrote:  ...
   Author: Dexin Wang, 2011-03-08, 22:04