PK violation during Hive add partition
Karlen Lie 2012-12-08, 02:01
Hello,
We are running into intermittent errors while running the query below. Some background: the table we're altering (tbl_someTable) is an external table, and the query is run concurrently by multiple Oozie workflows.
ALTER TABLE tbl_someTable ADD IF NOT EXISTS
PARTITION (cluster_address = '${CLUSTERADDRESS}', upload_date = '${PREVIOUSDATE}', upload_hour = '${PREVIOUSHOUR}')
LOCATION 'asv://${RAWLOGSCONTAINER}/${CLUSTERADDRESS}/someLog/${PREVIOUSDATE}/${PREVIOUSHOUR}';
The errors we're getting are below.
Is this a known issue and is there a workaround for it?
Thanks,
Karlen
stderr logs
WARNING: org.apache.hadoop.metrics.jvm.EventCounter is deprecated. Please use org.apache.hadoop.log.metrics.EventCounter in all the log4j.properties files.
Logging initialized using configuration in jar:file:/c:/hdfs/mapred/local/taskTracker/distcache/5662320028645753518_889604055_1925270295/10.175.202.81/user/dssxuser/share/lib/hive/hive-common-0.9.0.jar!/hive-log4j.properties
Hive history file=/tmp/dssxuser/hive_job_log_dssxuser_201212070113_1149932084.txt
FAILED: Error in metadata: javax.jdo.JDODataStoreException: Insert of object "org.apache.hadoop.hive.metastore.model.MPartition@2a4e50f" using statement "INSERT INTO PARTITIONS (PART_ID,CREATE_TIME,SD_ID,PART_NAME,LAST_ACCESS_TIME,TBL_ID) VALUES (?,?,?,?,?,?)" failed : Violation of PRIMARY KEY constraint 'PK_partitions_PART_ID'. Cannot insert duplicate key in object 'dbo.PARTITIONS'. The duplicate key value is (221).
NestedThrowables:
com.microsoft.sqlserver.jdbc.SQLServerException: Violation of PRIMARY KEY constraint 'PK_partitions_PART_ID'. Cannot insert duplicate key in object 'dbo.PARTITIONS'. The duplicate key value is (221).
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask
Intercepting System.exit(9)
Failing Oozie Launcher, Main class [org.apache.oozie.action.hadoop.HiveMain], exit code [9]

stderr logs
WARNING: org.apache.hadoop.metrics.jvm.EventCounter is deprecated. Please use org.apache.hadoop.log.metrics.EventCounter in all the log4j.properties files.
Logging initialized using configuration in jar:file:/c:/hdfs/mapred/local/taskTracker/distcache/2751940372978647467_889604055_1925270295/10.175.202.81/user/dssxuser/share/lib/hive/hive-common-0.9.0.jar!/hive-log4j.properties
Hive history file=/tmp/dssxuser/hive_job_log_dssxuser_201212071515_173032638.txt
FAILED: Error in metadata: javax.jdo.JDODataStoreException: Insert of object "org.apache.hadoop.hive.metastore.model.MSerDeInfo@31ce40d5" using statement "INSERT INTO SERDES (SERDE_ID,SLIB,"NAME") VALUES (?,?,?)" failed : Violation of PRIMARY KEY constraint 'PK_serdes_SERDE_ID'. Cannot insert duplicate key in object 'dbo.SERDES'. The duplicate key value is (2006).
NestedThrowables:
com.microsoft.sqlserver.jdbc.SQLServerException: Violation of PRIMARY KEY constraint 'PK_serdes_SERDE_ID'. Cannot insert duplicate key in object 'dbo.SERDES'. The duplicate key value is (2006).
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask
Intercepting System.exit(9)
Failing Oozie Launcher, Main class [org.apache.oozie.action.hadoop.HiveMain], exit code [9]
Re: PK violation during Hive add partition
Ruslan Al-Fakikh 2012-12-10, 15:08
Hi!
Have you enabled Hive concurrency? Hive should not be accessed concurrently if the appropriate property is not enabled.
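For reference, here is a minimal sketch of what enabling that property might look like at the top of the Hive script that each Oozie action runs. `hive.support.concurrency`, `hive.zookeeper.quorum`, and `hive.zookeeper.client.port` are real Hive settings, and Hive's lock manager relies on ZooKeeper. The host names are hypothetical placeholders, and it is an assumption that session-level SET statements are enough in your deployment; these values are more commonly placed in hive-site.xml.

```sql
-- Assumption: a ZooKeeper ensemble is reachable from the Hive client.
-- zk1/zk2/zk3.example.com are hypothetical host names, not from this thread.
SET hive.support.concurrency=true;
SET hive.zookeeper.quorum=zk1.example.com,zk2.example.com,zk3.example.com;
SET hive.zookeeper.client.port=2181;

-- The existing ALTER TABLE ... ADD IF NOT EXISTS PARTITION statement
-- would follow here unchanged.
```

Note that this only turns on Hive's table and partition locking. It does not change how the metastore database allocates keys, and nothing in this thread confirms that it resolves the primary key violations.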
Ruslan
On Sat, Dec 8, 2012 at 6:01 AM, Karlen Lie <[EMAIL PROTECTED]> wrote:
> nal table, and the query below is run concurrently by multiple oo
Re: PK violation during Hive add partition
Edward Capriolo 2012-12-10, 15:18
This could also be an issue with DataNucleus and Microsoft SQL Server. The project officially supports only Derby and MySQL, and it only tests using Derby. Everything else is at your own risk.
On Mon, Dec 10, 2012 at 10:08 AM, Ruslan Al-Fakikh <[EMAIL PROTECTED]> wrote:
> Hi!
>
> Have you enabled Hive concurrency? Hive should not be accessed concurrently if the appropriate property is not enabled.
>
> Ruslan
>
> On Sat, Dec 8, 2012 at 6:01 AM, Karlen Lie <[EMAIL PROTECTED]> wrote:
>> nal table, and the query below is run concurrently by multiple oo
RE: PK violation during Hive add partition
Karlen Lie 2012-12-10, 17:12
Thanks! Looks like I've missed enabling the concurrency flag.
-karlen
From: Edward Capriolo [mailto:[EMAIL PROTECTED]]
Sent: Monday, December 10, 2012 7:19 AM
To: [EMAIL PROTECTED]
Subject: Re: PK violation during Hive add partition

This could also be an issue with DataNucleus and Microsoft SQL Server. The project officially supports only Derby and MySQL, and it only tests using Derby. Everything else is at your own risk.

On Mon, Dec 10, 2012 at 10:08 AM, Ruslan Al-Fakikh <[EMAIL PROTECTED]> wrote:
Hi!
Have you enabled Hive concurrency? Hive should not be accessed concurrently if the appropriate property is not enabled.
Ruslan

On Sat, Dec 8, 2012 at 6:01 AM, Karlen Lie <[EMAIL PROTECTED]> wrote:
nal table, and the query below is run concurrently by multiple oo