Results 1 to 10 of 25 (0.1s).
RE: Any way to join two aliases without using CROSS - Pig - [mail # user]
...Here is how to use rank and join for this problem:
sh cat xxx
1,2,3,4,5
1,2,4,5,7
1,5,7,8,9
sh cat yyy
10,11
10,12
10,13
a = load 'xxx' using PigStorage(',');
b = load 'yyy' using PigStorage(',');
a2 = r...
   Author: william.dowling@..., 2014-03-25, 20:28
RE: Need example of python code with dependency files - Pig - [mail # user]
...You said "The .py code takes input from sys.stdin and outputs to sys.stdout" so I infer you are talking about streaming, not a python UDF. In that case, rather than streaming through your py...
   Author: william.dowling@..., 2013-11-06, 14:23
RE: ORDER BY a map value fails with a syntax error - pig bug? - Pig - [mail # user]
...http://pig.apache.org/docs/r0.12.0/basic.html#order-by says "Pig currently supports ordering on fields with simple types or by tuple designator (*). You cannot order on fields with complex ...
   Author: william.dowling@..., 2013-10-29, 18:41
RE: Converting xml to csv - Pig - [mail # user]
...This is one way to get employee_id and email:   A = load 'xxx.xml' using org.apache.pig.piggybank.storage.XMLLoader('employee') as (x:chararray);  B = foreach A generate REPLACE(x,...
   Author: william.dowling@..., 2013-09-17, 15:19
RE: can't parse the values using XML loader - Pig - [mail # user]
...Part of the problem might be that the regexp has  (.*)  but you need (.*)  Using regexps to parse XML is awfully brittle. An alternative is to use a UDF that calls out to an X...
   Author: william.dowling@..., 2013-08-21, 16:19
RE: fuzzy logic through pig programming - Pig - [mail # user]
...http://www.slideshare.net/Hadoop_Summit/pig-programming-is-fun (Daniel Dai and Thejas Nair) indicates how to use the nltk library from inside pig.  nltk has methods to compute various s...
   Author: william.dowling@..., 2013-06-27, 19:38
RE: missing error log - Pig - [mail # user]
...Thanks Johnny for your reply.  Working backwards: I am using MRv1.  I did try the logging suggestion you made, but did not get any other info.  So I did it the old-fashioned w...
   Author: william.dowling@..., 2013-03-25, 21:47
RE: Parsing XML using PIG - Pig - [mail # user]
...I just use XMLLoader to break the input xml into records, then stream that through an xml parser to pull out what I need into the fields of a relation for subsequent pig processing.  Li...
   Author: william.dowling@..., 2012-04-20, 14:16
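The stream-through-a-parser step described in this message might look roughly like the sketch below. It is not the author's script: it assumes each XML record arrives on stdin as a single line, and the child element names (employee_id, email) are placeholders borrowed from another thread in this list, not fields named in this message.

```python
import sys
import xml.etree.ElementTree as ET

# Hypothetical child element names; the original message does not list its fields.
FIELDS = ("employee_id", "email")

def record_to_row(xml_text, fields=FIELDS):
    """Parse one XML record and return the text of each requested child element."""
    root = ET.fromstring(xml_text)
    # findtext returns None when the child is absent; emit an empty field instead.
    return [(root.findtext(name) or "").strip() for name in fields]

def main(stdin=sys.stdin, stdout=sys.stdout):
    """Read one XML record per line and write one tab-separated row per record."""
    for line in stdin:
        line = line.strip()
        if not line:
            continue
        try:
            row = record_to_row(line)
        except ET.ParseError:
            continue  # skip records that are not well-formed XML
        stdout.write("\t".join(row) + "\n")
```

A real streaming script would end with a call to main() under the usual __main__ guard, and the tab-separated output could then be read back into a relation for further Pig processing.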
RE: 0.9.1 out of memory problem - Pig - [mail # user]
...Nested DISTINCT is a killer. See   https://mail-archives.apache.org/mod_mbox/pig-user/201201.mbox/%[EMAIL PROTECTED]%3E  for a discussion of a simple workaround that worked for me....
   Author: william.dowling@..., 2012-01-18, 22:14
RE: Custom Loaders that use Input Streams for reading data? - Pig - [mail # user]
...I'm using org.apache.pig.piggybank.storage.XMLLoader from piggybank and that's working well for me.  I do something like this:   -- The analyze_src_recs.py script reads XML from st...
   Author: william.dowling@..., 2012-01-13, 15:11