Home | About | Sematext search-lucene.com search-hadoop.com
Results 1 to 10 of 25 (0.139s).
RE: Problem in understanding UDF COUNT - Pig - [mail # user]
...This was hard for me to get when I started using pig, and it still annoys me after 1.5 years' experience with pig. In mathematics and logic, quantifiers (like "for each", "there exist") bind...
   Author: william.dowling@..., 2014-07-21, 15:03
RE: Need example of python code with dependency files - Pig - [mail # user]
...You said "The .py code takes input from sys.stdin and outputs to sys.stdout" so I infer you are talking about streaming, not a python UDF. In that case, rather than streaming through your py...
   Author: william.dowling@..., 2013-11-06, 14:23
RE: ORDER BY a map value fails with a syntax error - pig bug? - Pig - [mail # user]
...http://pig.apache.org/docs/r0.12.0/basic.html#order-by says "Pig currently supports ordering on fields with simple types or by tuple designator (*). You cannot order on fields with complex ...
   Author: william.dowling@..., 2013-10-29, 18:41
RE: Converting xml to csv - Pig - [mail # user]
...This is one way to get employee_id and email:   A = load 'xxx.xml' using org.apache.pig.piggybank.storage.XMLLoader('employee') as (x:chararray);  B = foreach A generate REPLACE(x,...
   Author: william.dowling@..., 2013-09-17, 15:19
RE: can't parse the values using XML loader - Pig - [mail # user]
...Part of the problem might be that the regexp has  (.*)  but you need (.*)  Using regexps to parse XML is awfully brittle. An alternative is to use a UDF that calls out to an X...
   Author: william.dowling@..., 2013-08-21, 16:19
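The brittleness mentioned in this reply can be illustrated with a small sketch. The record, the `email` element, and both helper names are hypothetical and do not come from the thread. The only point shown is that a pattern written for a bare tag stops matching once an attribute appears, while a parser does not care.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical record; element and attribute names are illustrative only.
record = '<employee id="7"><email type="work">a@example.com</email></employee>'


def email_by_regex(xml_text):
    # The pattern expects a bare <email> tag, so any attribute breaks the match.
    match = re.search(r"<email>(.*)</email>", xml_text)
    return match.group(1) if match else None


def email_by_parser(xml_text):
    # A real XML parser looks up the element by name and ignores its attributes.
    return ET.fromstring(xml_text).findtext("email")


print(email_by_regex(record))   # the regex finds nothing
print(email_by_parser(record))  # the parser returns the element text
```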
RE: fuzzy logic through pig programming - Pig - [mail # user]
...http://www.slideshare.net/Hadoop_Summit/pig-programming-is-fun (Daniel Dai and Thejas Nair) indicates how to use the nltk library from inside pig.  nltk has methods to compute various s...
   Author: william.dowling@..., 2013-06-27, 19:38
RE: missing error log - Pig - [mail # user]
...Thanks Johnny for your reply.  Working backwards: I am using MRv1.  I did try the logging suggestion you made, but did not get any other info.  So I did it the old-fashioned w...
   Author: william.dowling@..., 2013-03-25, 21:47
RE: Parsing XML using PIG - Pig - [mail # user]
...I just use XMLLoader to break the input xml into records, then stream that through an xml parser to pull out what I need into the fields of a relation for subsequent pig processing.  Li...
   Author: william.dowling@..., 2012-04-20, 14:16
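A minimal sketch of the streaming step described in this reply follows, under stated assumptions: each XML record arrives on a single input line, the wanted values are direct children of the record's root element, and the field names `employee_id` and `email` are borrowed from the "Converting xml to csv" result above purely for illustration. The author's actual script is not shown in the snippet, so this is not a reconstruction of it.

```python
import xml.etree.ElementTree as ET

# Hypothetical field names, used for illustration only.
FIELDS = ("employee_id", "email")


def record_to_row(xml_text, fields=FIELDS):
    """Turn one XML record into a tab-separated row, or None if it is malformed."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        return None
    # Missing elements become empty strings so every row has the same width.
    return "\t".join((root.findtext(name) or "").strip() for name in fields)


def main(lines, out):
    """Read one record per line from `lines` and write one row per record to `out`."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        row = record_to_row(line)
        if row is not None:
            out.write(row + "\n")
```

When used as a streaming script, the entry point would be called as `main(sys.stdin, sys.stdout)` after `import sys`. Malformed records are skipped silently here; logging them to stderr may be a better choice in practice.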
RE: 0.9.1 out of memory problem - Pig - [mail # user]
...Nested DISTINCT is a killer. See   https://mail-archives.apache.org/mod_mbox/pig-user/201201.mbox/%[EMAIL PROTECTED]%3E  for a discussion of a simple workaround that worked for me....
   Author: william.dowling@..., 2012-01-18, 22:14
RE: Custom Loaders that use Input Streams for reading data? - Pig - [mail # user]
...I'm using org.apache.pig.piggybank.storage.XMLLoader from piggybank and that's working well for me.  I do something like this:   -- The analyze_src_recs.py script reads XML from st...
   Author: william.dowling@..., 2012-01-13, 15:11