Hive, mail # user - Hive query started map task being killed during execution


Re: Hive query started map task being killed during execution
Dileep Kumar 2013-03-08, 22:31
Thanks for your attention!
No, only one Hive process is running, and what bothers me is that a smaller
query, which I invoke the same way, runs to completion. It is using the
embedded DB; if that is the problem I can change it to an external DB, but
since my smaller query runs fine I thought this should be OK.
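
For reference, a minimal sketch of what switching the metastore to an external
database might look like in hive-site.xml. It assumes MySQL with the
Connector/J driver jar on Hive's classpath; the host name, database name, user
and password below are hypothetical placeholders, not values from this
cluster. Note that the ERROR lines in the task logs come from
JDBCStatsPublisher rather than from the metastore, so this change alone may
not make them go away.

```xml
<!-- Sketch only: metastore-db.example.com, hive_metastore, hiveuser and
     the password are made-up placeholders; substitute your own values. -->
<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:mysql://metastore-db.example.com:3306/hive_metastore?createDatabaseIfNotExist=true</value>
  <description>JDBC connect string for a JDBC metastore</description>
</property>
<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>com.mysql.jdbc.Driver</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionUserName</name>
  <value>hiveuser</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionPassword</name>
  <value>change-me</value>
</property>
```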
On Fri, Mar 8, 2013 at 2:16 PM, Dean Wampler <[EMAIL PROTECTED]> wrote:

> Do you have more than one hive process running? It looks like you're using
> Derby, which only supports one process at a time. Also, you have to start
> Hive from the same directory every time, where the metastore "database" is
> written, unless you edit the JDBC connection property in the Hive config
> file to point to a particular path. Here's what I use:
>
> <property>
>   <name>javax.jdo.option.ConnectionURL</name>
>   <value>jdbc:derby:;databaseName=/Users/somedufus/hive/metastore_db;create=true</value>
>   <description>JDBC connect string for a JDBC metastore</description>
> </property>
>
>
> On Fri, Mar 8, 2013 at 4:09 PM, Dileep Kumar <[EMAIL PROTECTED]> wrote:
>
>> Hi All,
>>
>> I am running a Hive query that inserts into a table.
>> From the symptoms, it looks like the problem has to do with some
>> setting, but I am not able to figure out which one.
>>
>> When I submit the query it starts 2130 map tasks in the job. 150 of
>> them complete without any error, then the next batch of 75 gets killed,
>> and all tasks after that get killed as well.
>> When I submit a similar query based on a smaller table, it starts only
>> around 135 map tasks, runs to completion without any error, and does
>> the insert into the appropriate table.
>>
>> I don't find any obvious error messages in any of the task logs apart
>> from this:
>>
>>
>> ./hadoop-0.20-mapreduce/userlogs/job_201303080834_0001/attempt_201303080834_0001_m_001636_0/syslog:2013-03-08
>> 08:54:06,910 INFO orapache.hadoop.hive.ql.exec.MapOperator:
>> DESERIALIZE_ERRORS:0
>> ./hadoop-0.20-mapreduce/userlogs/job_201303080834_0001/attempt_201303080834_0001_m_001646_0/syslog:2013-03-08
>> 08:41:06,060 INFO orapache.hadoop.hive.ql.exec.MapOperator:
>> DESERIALIZE_ERRORS:0
>> ./hadoop-0.20-mapreduce/userlogs/job_201303080834_0001/attempt_201303080834_0001_m_001646_0/syslog:2013-03-08
>> 08:46:54,390 ERROR o.apache.hadoop.hive.ql.stats.jdbc.JDBCStatsPublisher:
>> Error during instantiating JDBC driver org.apache.derby.jdbc.EmbeddedDriver.
>> ./hadoop-0.20-mapreduce/userlogs/job_201303080834_0001/attempt_201303080834_0001_m_001646_0/syslog:2013-03-08
>> 08:46:54,394 ERROR o.apache.hadoop.hive.ql.exec.FileSinkOperator:
>> StatsPublishing error: cannot connect to database
>>
>> Please suggest whether I need to set anything in Hive when I invoke this
>> query. The query that runs successfully has far fewer rows than the one
>> that fails.
>>
>> Thanks,
>> DK
>>
>
>
>
> --
> *Dean Wampler, Ph.D.*
> thinkbiganalytics.com
> +1-312-339-1330
>
>