Re: Pig vs hive performance
From the Amazon Elastic MapReduce FAQ:
http://aws.amazon.com/elasticmapreduce/faqs/#hive-8
Q: When should I use Hive vs. PIG?

Hive and PIG both provide high-level data-processing languages with support
for complex data types for operating on large datasets. The Hive language
is a variant of SQL and so is more accessible to people already familiar
with SQL and relational databases. Hive has support for partitioned tables
which allow Amazon Elastic MapReduce job flows to pull down only the table
partition relevant to the query being executed rather than doing a full
table scan. Both PIG and Hive have query plan optimization. PIG is able to
optimize across an entire script, while Hive queries are optimized at the
statement level.

Ultimately the choice of whether to use Hive or PIG will depend on the
exact requirements of the application domain and the preferences of the
implementers and those writing queries.
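
To make the partitioned-table point a bit more concrete, here is a rough toy
model in Python of what "pull down only the relevant partition" buys you
compared with a full table scan. This is only a sketch and not how Hive
actually implements it: the table layout, the `dt` partition key, the
(user, clicks) columns, and the function names are all made up for
illustration.

```python
# Toy model of partition pruning. NOT Hive's implementation; the table,
# the `dt` partition key and the columns are hypothetical.
from typing import Dict, List, Tuple

Row = Tuple[str, int]  # (user, clicks), made-up columns

# A "table" stored as one list of rows per partition value.
partitions: Dict[str, List[Row]] = {
    "2012-10-01": [("alice", 3), ("bob", 5)],
    "2012-10-02": [("alice", 1), ("carol", 7)],
    "2012-10-03": [("bob", 2)],
}


def full_scan(dt: str) -> Tuple[int, int]:
    """Read every row in every partition, filtering on dt afterwards.

    Returns (sum of clicks for dt, number of rows read).
    """
    total = 0
    rows_read = 0
    for part_dt, rows in partitions.items():
        for _user, clicks in rows:
            rows_read += 1
            if part_dt == dt:
                total += clicks
    return total, rows_read


def pruned_scan(dt: str) -> Tuple[int, int]:
    """Read only the partition whose key matches dt.

    Returns (sum of clicks for dt, number of rows read).
    """
    rows = partitions.get(dt, [])
    return sum(clicks for _user, clicks in rows), len(rows)


if __name__ == "__main__":
    print(full_scan("2012-10-02"))
    print(pruned_scan("2012-10-02"))
```

Both functions return the same total; the only difference is how many rows
had to be read to get it, which is the saving the FAQ is describing.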

On Thu, Oct 4, 2012 at 7:52 AM, Abhishek <[EMAIL PROTECTED]> wrote:

> Hi all,
>
> Can we discuss the performance of Pig vs. Hive?
>
> 1) What is Hive good at?
> 2) What is Pig good at?
> 3) Hive optimizer vs. Pig optimizer
> 4) Hive limitations vs. Pig limitations
>
> Regards
> Abhi