Anecdotally, Pig seems to scale down better than Hive. We see this in
tests: Hive scripts run over small amounts of data take much longer
than similar Pig scripts, even with Hive's parallel settings enabled.
I think this is because there doesn't seem to be a 'local' mode for
Hive; you have to run it as MapReduce jobs (either embedded or on a
cluster). Please correct me if I am wrong.
On Wed, Oct 3, 2012 at 3:52 PM, Abhishek <[EMAIL PROTECTED]> wrote:
> Hi all,
> Can we discuss performance of pig vs hive
> 1) what hive is good at?
> 2) what pig is good at?
> 3) Hive optimizer vs pig optimizer
> 4) hive limitations vs pig limitations
Dan Richelson, Software Engineer
2560 55th St. | Boulder, Colorado 80301