Home | About | Sematext search-lucene.com search-hadoop.com
 Search Hadoop and all its subprojects:

Switch to Threaded View
Hive, mail # user - Strange error in Hive - Insert INTO


Copy link to this message
-
Re: Strange error in Hive - Insert INTO
Stephen Sprague 2013-08-14, 15:43
Hi Jerome,
That's a grandiose sql statement you got there! :)    I find that if you
break up those nested queries into simple CTAS (Create Table AS) statements
and create a cascading effect of referring to the table in the previous
step it makes debugging *so* much easier.  In other SQL dialects like DB2
this is facilitated by the WITH keyword. Maybe the Hive gurus will
implement that some day.   But that's a topic for another day.

So all that said, i see that the columns in your create table statement
don't match the columns in your outermost select statement.  In particular,
DT_JOUR is listed as the 6th column in your create table statement but it
appears to be the 2nd column in your select statement. So something looks
fishy there.

My guess is ultimately you're missing a comma somewhere in the select list
so hive is eating an column as a column alias and all your data is skewed
over by one column. This happens not so infrequently since it is valid sql.

Long winded answer to a simple question. Apologies up front!
On Wed, Aug 14, 2013 at 5:35 AM, Jérôme Verdier
<[EMAIL PROTECTED]>wrote:

> Hi everybody,
>
> I faced a strange error in Hive today.
>
> I have launch a hive script to make some calculations, joins, union,
> etc... and then insert these results in over hive table.
>
> Everything is working fine (.hql is working, full ok, data are imported),
> but one field (CO_RGRP_PRODUITS) is very strange.
>
> after the insert, CO_RGRP_PRODUITS is looking like a TIMESTAMP (1970-01-01
> 01:00:00) instead of being a simple STRING.
>
> I precise that source field are simple string like this  : 0101380,  for
> example
>
> What is going wrong here.
>
> You can find my script below (create table and .hql insert/calculations)
>
> Thanks for your help.
>
>
> INSERT SCRIPT :
> --THM_CA_RGRP_PRODUITS_JOUR
> CREATE TABLE default.THM_CA_RGRP_PRODUITS_JOUR (
>     CO_SOCIETE BIGINT,
>     TYPE_ENTITE STRING,
>     CODE_ENTITE STRING,
>     TYPE_RGRP_PRODUITS STRING,
>     CO_RGRP_PRODUITS STRING,
>     DT_JOUR TIMESTAMP,
>     MT_CA_NET_TTC FLOAT,
>     MT_OBJ_CA_NET_TTC FLOAT,
>     NB_CLIENTS FLOAT,
>     MT_CA_NET_TTC_COMP FLOAT,
>     MT_OBJ_CA_NET_TTC_COMP FLOAT,
>     NB_CLIENTS_COMP FLOAT);
>
> INSERT SCRIPT :
>
> INSERT INTO TABLE THM_CA_RGRP_PRODUITS_JOUR
>
>   SELECT
>           1                                                  as
> CO_SOCIETE,-- A modifier => variable
>           '2013-01-02 00:00:00.0'                                     as
> DT_JOUR, -- A modifier => variable
>           'MAG'                                                       as
> TYPE_ENTITE,
>           m.co_magasin                                                as
> CODE_ENTITE,
>           'FAM'                                                       as
> TYPE_RGRP_PRODUITS,
>           sourceunion.CO_RGRP_PRODUITS                                as
> CO_RGRP_PRODUITS,
>           SUM(MT_CA_NET_TTC)                                          as
> MT_CA_NET_TTC,
>           SUM(MT_OBJ_CA_NET_TTC)                                      as
> MT_OBJ_CA_NET_TTC,
>           SUM(NB_CLIENTS)                                             as
> NB_CLIENTS,
>           SUM(MT_CA_NET_TTC_COMP)                                     as
> MT_CA_NET_TTC_COMP,
>           SUM(MT_OBJ_CA_NET_TTC_COMP)                                 as
> MT_OBJ_CA_NET_TTC_COMP,
>           SUM(NB_CLIENTS_COMP)                                        as
> NB_CLIENTS_COMP
>
>         FROM (
>   SELECT
>             mtransf.id_mag_transfere             as
> ID_MAGASIN,
>             v.co_famille                         as CO_RGRP_PRODUITS,
>             sum(v.mt_ca_net_ttc)                 as MT_CA_NET_TTC,
>             0                                    as MT_OBJ_CA_NET_TTC,
>             0                                    as NB_CLIENTS,
>             sum(v.mt_ca_net_ttc * (CASE WHEN mtransf.flag_mag_comp = 'NC'
> THEN 0 ELSE 1 END))
>                                                  as MT_CA_NET_TTC_COMP,