Corbin Hoenes 2010-04-15, 16:55
Hi there,
I've got a bunch of Pig scripts whose output I would like to check against some known good data. I've found that running the scripts in local mode (via bash scripts) on my local box is quite fast. Would this be a good testing mechanism, or should I be running only under mapred mode?
Alan Gates 2010-04-21, 20:53
Local mode is reasonable for initial testing. However, since local mode execution currently differs from mapred mode execution, the tests should be considered incomplete.
As of Pig 0.7, local mode will also run on Hadoop. This means the speed advantage will vanish, but it will be a truer test of your scripts.
Alan.
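A rough sketch of the kind of check discussed above, in Python rather than bash. It assumes the script STOREs its result to a path on the local filesystem, and that the result is either a single file or a directory of part-* files. The helper names (pig_command, read_output, read_expected, check_script) are made up for illustration; only the `pig -x <mode> <script>` invocation comes from Pig itself.

```python
import glob
import os
import subprocess


def pig_command(script_path, mode="local"):
    """Build the Pig command line; -x selects the execution mode."""
    if mode not in ("local", "mapreduce"):
        raise ValueError("unsupported mode: %s" % mode)
    return ["pig", "-x", mode, script_path]


def read_output(output_path):
    """Return all output lines, sorted so that row order does not matter.

    Assumption: the output is either one file or a directory of part-* files.
    """
    if os.path.isdir(output_path):
        paths = sorted(glob.glob(os.path.join(output_path, "part-*")))
    else:
        paths = [output_path]
    lines = []
    for path in paths:
        with open(path) as handle:
            lines.extend(line.rstrip("\n") for line in handle)
    return sorted(lines)


def read_expected(expected_path):
    """Return the known good lines, sorted the same way as the output."""
    with open(expected_path) as handle:
        return sorted(line.rstrip("\n") for line in handle)


def check_script(script_path, output_path, expected_path):
    """Run the script in local mode and compare its output to known good data."""
    subprocess.check_call(pig_command(script_path, "local"))
    actual = read_output(output_path)
    expected = read_expected(expected_path)
    assert actual == expected, "output differs from known good data"
```

The comparison sorts both sides because the order of output rows is not guaranteed unless the script ends with an ORDER. To repeat the check in mapred mode, the output would first have to be copied out of HDFS, which this sketch does not attempt.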
prasenjit mukherjee 2010-04-22, 02:43
Doesn't Hadoop imply mapred mode? What does it mean to run Pig local mode on Hadoop?
Corbin Hoenes 2010-04-22, 04:03
Thanks, Alan, for the reply. So does this mean one will have to set up Pig and Hadoop (manually) on the same box, or will Pig have all the Hadoop dependencies built in (probably running Hadoop in pseudo-distributed mode)?
Jeff Zhang 2010-04-22, 07:54
Pig local mode on Hadoop means running the MapReduce jobs in Hadoop's local mode. The previous Pig local mode took a different approach from mapreduce mode: it did not split and merge the data, and was just Pig's own implementation of Pig Latin. You do not need to do anything extra to use Pig local mode on Hadoop.
-- Best Regards
Jeff Zhang