Search results 1 to 10 of 81.

Re: Nb of reduce tasks when GROUPing - Pig - [mail # user]
...As Jonathan mentioned, TOP should obviate this particular use case. But for future examples, the parameters pig.exec.reducers.bytes.per.reducer and pig.exec.reducers.max might be usefu...
Author: Norbert Burger, 2013-05-21, 17:23

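To make the two parameters named in the snippet above concrete, here is a minimal Python sketch of how Pig is generally understood to estimate the reducer count when no PARALLEL clause is given: input size divided by bytes-per-reducer, rounded up, then capped at the maximum. The function name and the default values (1,000,000,000 bytes and 999 reducers) are assumptions for illustration; check the defaults for your Pig version.

```python
import math


def estimate_reducers(input_bytes, bytes_per_reducer=1_000_000_000, max_reducers=999):
    """Rough model of input-size-based reducer estimation.

    bytes_per_reducer stands in for pig.exec.reducers.bytes.per.reducer
    and max_reducers for pig.exec.reducers.max. The defaults here are
    assumed values, not read from any Pig installation.
    """
    if input_bytes <= 0:
        # Always schedule at least one reducer.
        return 1
    wanted = math.ceil(input_bytes / bytes_per_reducer)
    # Clamp to the range [1, max_reducers].
    return max(1, min(max_reducers, wanted))


if __name__ == "__main__":
    # 2.5 GB of input with the assumed defaults.
    print(estimate_reducers(2_500_000_000))
```

Lowering the bytes-per-reducer value raises the estimate for the same input, while the maximum acts as a hard ceiling regardless of input size.
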
Re: Nb of reduce tasks when GROUPing - Pig - [mail # user]
...Take a look at the PARALLEL clause: http://pig.apache.org/docs/r0.7.0/cookbook.html#Use+the+PARALLEL+Clause On Fri, May 17, 2013 at 10:48 AM, Vincent Barat wrote: ...
Author: Norbert Burger, 2013-05-19, 13:37

Re: Ignore first record of a file - Pig - [mail # user]
...Perhaps the general way to do this is to write a custom loader, but for this simpler use case, can you just filter out the record? FILTER ... BY $0 MATCHES '^[0-9]+' Norbert ...
Author: Norbert Burger, 2013-03-14, 17:38

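The filter suggested in the snippet above keeps only records whose first field looks numeric, which drops a textual header row. The Python sketch below mimics that idea under one assumption worth stating: that Pig's MATCHES follows Java's whole-string matching, so the pattern must cover the entire field and not just a prefix. The function name and sample records are hypothetical.

```python
import re

# Whole-string match on digits only, assumed to mirror the behavior of
# `$0 MATCHES '^[0-9]+'` under Java-style full-match semantics.
ALL_DIGITS = re.compile(r"[0-9]+")


def keep_numeric_first_field(records):
    """Return only the records whose first field consists entirely of digits."""
    return [r for r in records if r and ALL_DIGITS.fullmatch(str(r[0]))]


if __name__ == "__main__":
    # Hypothetical sample: a header row followed by data rows.
    rows = [("id", "name"), ("1", "a"), ("22b", "x"), ("303", "c")]
    print(keep_numeric_first_field(rows))
```

Under this assumption a field such as `22b` is rejected even though it starts with digits, so a pattern meant to accept such values would need a trailing `.*`.
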
Re: removing dupes from a bag while saving first occurrence - Pig - [mail # user]
...Looking at your sample, it seems you have a GROUPBY generating these bags...? Could you just insert a DISTINCT before this GROUP BY? Norbert On Fri, Mar 8, 2013 at 5:00 PM,...
Author: Norbert Burger, 2013-03-08, 22:10

Re: too many memory spills - Pig - [mail # user]
...I thought Todd Lipcon's Hadoop Summit presentation [1] had some good info on this topic. [1] http://www.slideshare.net/cloudera/mr-perf Norbert On Thu, Mar 7, 2013 at 7:25 ...
Author: Norbert Burger, 2013-03-08, 02:47

HBaseStorage and setBatch() - Pig - [mail # user]
...We're using HBaseStorage to read some large rows (50k cols) and hitting some perf issues (responseTooSlow and responseTooLarge in RS logs). Looking through the code, I see that there's...
Author: Norbert Burger, 2013-03-07, 18:04

Re: Validating tuple length - Pig - [mail # user]
...FILTER SIZE(tuple) == 14 won't work for your use case? On Thu, Aug 30, 2012 at 3:39 PM, Sam William wrote: e length. I expect every record to have 14 fields, but some reco...
Author: Norbert Burger, 2012-08-30, 20:19

Re: DATA not storing as comma-separted - Pig - [mail # user]
...Yogesh -- based on the log info you provided, it seems like your input data is not tab-delimited, which is the default delimiter when using PigStorage. As a result, your 3 space-separate...
Author: Norbert Burger, 2012-07-26, 02:35

Re: Import libraries in Jython UDFs - Pig - [mail # user]
...Have you registered the JAR in your Pig script (for local mode) and also added it to PIG_CLASSPATH (for remote mode, to get it into the distributed cache)? Norbert On Mon, Jul 23...
Author: Norbert Burger, 2012-07-24, 01:02

Re: FLATTEN() behavior difference in 0.8.1 and 0.10.0 ? - Pig - [mail # user]
...Yang -- I think you'll get the representation you're looking for by applying the FLATTEN a second time. Each instance of a FLATTEN strips off a single layer. Norbert On Sun...
Author: Norbert Burger, 2012-06-25, 10:55