Hive, mail # user - Getting Slow Query Performance!


RE: Getting Slow Query Performance!
Bennie Schut 2013-03-12, 09:40
Generally, a single Hadoop machine will perform worse than a single MySQL machine. People normally use Hadoop when they have so much data that it won't fit on a single machine without specialized hardware (such as a SAN).
30 GB of data isn't that much, and 2 GB of RAM is far less than Hadoop is designed to work with. It likes to have lots of memory.
I also don't see your Hadoop configuration files, so perhaps you only have 1 mapper and 1 reducer. But this is not a typical use case, so I doubt you'll see snappy performance even after tweaking the configs.
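
For reference, the number of map and reduce tasks a node runs at once is set in the Hadoop configuration, not in hive-site.xml. The fragment below is a minimal sketch, assuming a Hadoop 1.x (MRv1) install where these settings live in mapred-site.xml. The values are illustrative guesses for a dual-core box with 2 GB of RAM, not tested recommendations.

```xml
<?xml version="1.0"?>
<!-- mapred-site.xml: assumes Hadoop 1.x (MRv1); all values are illustrative only -->
<configuration>
  <property>
    <!-- Maximum number of map tasks this TaskTracker runs at the same time -->
    <name>mapred.tasktracker.map.tasks.maximum</name>
    <value>2</value>
  </property>
  <property>
    <!-- Maximum number of reduce tasks this TaskTracker runs at the same time -->
    <name>mapred.tasktracker.reduce.tasks.maximum</name>
    <value>2</value>
  </property>
  <property>
    <!-- JVM options (here, heap size) for each map or reduce child process -->
    <name>mapred.child.java.opts</name>
    <value>-Xmx200m</value>
  </property>
</configuration>
```

With these numbers, up to four task JVMs of 200 MB heap each can run next to the Hadoop daemons, which already leaves little headroom on a 2 GB machine.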

From: Gobinda Paul [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, March 12, 2013 10:10 AM
To: [EMAIL PROTECTED]
Subject: Getting Slow Query Performance!

I used Sqoop to import 30 GB of data (two tables: employee (approx. 21 GB) and salary (approx. 9 GB)) into Hadoop (single node) via Hive.

I ran a sample query like: SELECT EMPLOYEE.ID,EMPLOYEE.NAME,EMPLOYEE.DEPT,SALARY.AMOUNT FROM EMPLOYEE JOIN SALARY WHERE EMPLOYEE.ID=SALARY.EMPLOYEE_ID AND SALARY.AMOUNT>900000;

In Hive it takes approx. 15 min, whereas MySQL takes approx. 4.5 min to execute that query.

CPU: Pentium(R) Dual-Core CPU E5700 @ 3.00GHz
RAM: 2 GB
HDD: 500 GB
Here is my hive-site.xml configuration:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://localhost:3306/metastore?createDatabaseIfNotExist=true</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>root</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>123456</value>
  </property>
  <property>
    <name>hive.hwi.listen.host</name>
    <value>0.0.0.0</value>
    <description>This is the host address the Hive Web Interface will listen on</description>
  </property>
  <property>
    <name>hive.hwi.listen.port</name>
    <value>9999</value>
    <description>This is the port the Hive Web Interface will listen on</description>
  </property>
  <property>
    <name>hive.hwi.war.file</name>
    <value>/lib/hive-hwi-0.9.0.war</value>
    <description>This is the WAR file with the jsp content for Hive Web Interface</description>
  </property>

  <property>
    <name>mapred.reduce.tasks</name>
    <value>-1</value>
    <description>The default number of reduce tasks per job. Typically set
      to a prime close to the number of available hosts. Ignored when
      mapred.job.tracker is "local". Hadoop sets this to 1 by default, whereas Hive uses -1 as its default value.
      By setting this property to -1, Hive will automatically figure out what the number of reducers should be.
    </description>
  </property>

  <property>
    <name>hive.exec.reducers.bytes.per.reducer</name>
    <value>1000000000</value>
    <description>Size per reducer. The default is 1G, i.e. if the input size is 10G, it will use 10 reducers.</description>
  </property>
  <property>
    <name>hive.exec.reducers.max</name>
    <value>999</value>
    <description>The maximum number of reducers that will be used. If the value
      specified in the configuration parameter mapred.reduce.tasks is
      negative, Hive will use this as the maximum number of reducers when
      automatically determining the number of reducers.
    </description>
  </property>

  <property>
    <name>hive.exec.scratchdir</name>
    <value>/tmp/hive-${user.name}</value>
    <description>Scratch space for Hive jobs</description>
  </property>

  <property>
    <name>hive.metastore.local</name>
    <value>true</value>
  </property>

</configuration>
Any ideas?
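
As a rough check on the reducer settings in the configuration above: with mapred.reduce.tasks set to -1, Hive estimates the reducer count from the input size, as the property descriptions say. The arithmetic below is a sketch, assuming the query reads about 30 GB in total (both tables in full) and taking 1 GB as 10^9 bytes. Here S is the assumed input size, B is hive.exec.reducers.bytes.per.reducer and R_max is hive.exec.reducers.max.

```latex
\[
R \;=\; \min\!\left(R_{\max},\ \max\!\left(1,\ \left\lceil \frac{S}{B} \right\rceil\right)\right)
  \;=\; \min\!\left(999,\ \max\!\left(1,\ \left\lceil \frac{30 \times 10^{9}}{10^{9}} \right\rceil\right)\right)
  \;=\; 30
\]
```

Under those assumptions Hive would ask for roughly 30 reduce tasks, and a single dual-core node can only run a few of them at a time, so they would execute in several waves.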