I am trying to use the new feature in Pig 0.8 that lets a data flow include custom MapReduce jobs. The book "Programming Pig" gives a fairly simple example, but it leaves me puzzled. The example is below:
crawl = load 'webcrawl' as (url, pageid);
normalized = foreach crawl generate normalize(url);
goodurls = mapreduce 'blacklistchecker.jar'
    store normalized into 'input'
    load 'output' as (url, pageid)
    `com.acmeweb.security.BlackListChecker -i input -o output`;
My MapReduce program needs three parameters: two input paths and one output path. How can I pass them to the "mapreduce" command?
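To make the question concrete, here is a rough sketch of how my driver would read its three paths. It assumes (this is only my reading of the book's example, not something I have confirmed) that the backtick-quoted string is handed to the jar's main class as ordinary command-line arguments. The class name BlackListArgs and the repeated -i flag are hypothetical, chosen just for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical argument holder for a driver that takes two inputs and one output.
// Assumes the arguments arrive as a plain String[] in main().
final class BlackListArgs {
    final List<String> inputs = new ArrayList<>();
    String output;

    // Parses repeated "-i <path>" flags and exactly one "-o <path>" flag.
    static BlackListArgs parse(String[] args) {
        BlackListArgs parsed = new BlackListArgs();
        for (int i = 0; i < args.length; i++) {
            String flag = args[i];
            if (!flag.equals("-i") && !flag.equals("-o")) {
                throw new IllegalArgumentException("Unknown argument: " + flag);
            }
            if (i + 1 >= args.length) {
                throw new IllegalArgumentException("Missing value for " + flag);
            }
            String value = args[++i];
            if (flag.equals("-i")) {
                parsed.inputs.add(value);
            } else if (parsed.output == null) {
                parsed.output = value;
            } else {
                throw new IllegalArgumentException("Only one -o is allowed");
            }
        }
        if (parsed.inputs.size() != 2 || parsed.output == null) {
            throw new IllegalArgumentException("Expected two -i paths and one -o path");
        }
        return parsed;
    }

    public static void main(String[] args) {
        BlackListArgs parsed = parse(args);
        System.out.println("inputs=" + parsed.inputs + " output=" + parsed.output);
    }
}
```

With this, the string I would want to put between the backticks is something like "-i input -i blacklist -o output" after the class name, where "blacklist" is a made-up second input path.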
By the way, could you please give more details about the mapreduce command? I have found very little documentation on it.
Thanks very much!!
June 5th, 2013