NEW: Monitor These Apps!
elasticsearch, apache solr, apache hbase, hadoop, redis, casssandra, amazon cloudwatch, mysql, memcached, apache kafka, apache zookeeper, apache storm, ubuntu, centOS, red hat, debian, puppet labs, java, senseiDB
MapReduce >> mail # user >> Improving query performance on hive and hdfs


iwannaplay games 2012-09-05, 06:19
Re: Improving query performance on hive and hdfs
You know that Hadoop is not designed for low latency. To say anything
useful I think you should share some more details:

- What query are you launching (does it have a join or group by)?
- How many mappers, reducers, and jobs does the query spawn?
- What does your data look like?
- Also, what version of Hadoop are you running, etc.?

Some things that may apply, depending on the answers above:
- Check if you can partition your data so that Hive can do partition pruning.
- If your query has joins then look at
https://cwiki.apache.org/Hive/languagemanual-joins.html (bottom of
page) to see how to organize your data to let Hive do a map side join.
- Try to play with the config option
mapreduce.job.reduce.slowstart.completedmaps, this can help you if you
have a lot of idle reducers in the map phase.
- I would try to limit the number of tasks per node to the number of
CPUs on the system, but I don't know if this is common practice.
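
To make the partitioning and slowstart points a bit more concrete, here
is a toy Python model. It is not Hive or Hadoop code: the dt partition
column, the three-way split of the 90 million rows, and both function
names are made up for illustration, and the exact rounding Hadoop
applies to the slowstart fraction is an assumption.

```python
import math

# Hypothetical layout: one directory per value of the partition column,
# each holding a share of the 90 million rows.
PARTITIONS = {
    "dt=2012-09-01": 30_000_000,
    "dt=2012-09-02": 30_000_000,
    "dt=2012-09-03": 30_000_000,
}


def rows_scanned(partitions, wanted_dt=None):
    """Return how many rows a query has to read.

    Without a filter on the partition column every partition is read.
    With one, only the matching partition directory is read.
    """
    if wanted_dt is None:
        return sum(partitions.values())
    return partitions.get(f"dt={wanted_dt}", 0)


def maps_before_reducers_start(total_maps, slowstart):
    """Approximate number of map tasks that must finish before reducers
    are scheduled, for a slowstart fraction between 0.0 and 1.0."""
    return math.ceil(total_maps * slowstart)


print(rows_scanned(PARTITIONS))                # no partition filter
print(rows_scanned(PARTITIONS, "2012-09-02"))  # filter on dt
print(maps_before_reducers_start(200, 0.5))
```

The point of the first function is that a filter on the partition
column cuts the input before any mapper starts. The second shows why
raising the slowstart fraction keeps reducers from sitting idle while
most maps are still running.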

On Wed, Sep 5, 2012 at 8:19 AM, iwannaplay games
<[EMAIL PROTECTED]> wrote:
> Hi all,
>
> I ran a query on Hive over 90 million records that took 12 minutes to
> execute, and the same query on SQL Server took 8 minutes. My question is: how
> can I make Hadoop's performance better? Which configuration settings will
> improve the latency?
>
> Thanks & Regards
> Prabhjot