Home | About | Sematext search-lucene.com search-hadoop.com
 Search Hadoop and all its subprojects:

Switch to Threaded View
Hive, mail # user - Use RANK OVER PARTITION function in Hive 0.11


Copy link to this message
-
Re: Use RANK OVER PARTITION function in Hive 0.11
Jérôme Verdier 2013-07-17, 08:12
Hi Richa,

I have tried a simple query without joins, etc....

SELECT RANK() OVER (PARTITION BY mag.co_societe ORDER BY
mag.me_vente_ht),mag.co_societe, mag.me_vente_ht FROM
default.thm_renta_rgrp_produits_n_1 mag;

Unfortunately, the error is the same like previously.

Error: Query returned non-zero code: 40000, cause: FAILED:
SemanticException Failed to breakup Windowing invocations into Groups. At
least 1 group must only depend on input columns. Also check for circular
dependencies.
Underlying error: org.apache.hadoop.hive.ql.exec.UDFArgumentTypeException:
One or more arguments are expected.
SQLState:  42000
ErrorCode: 40000
2013/7/17 Richa Sharma <[EMAIL PROTECTED]>

> Jerome
>
> I would recommend that you try Rank function with columns from just one
> table first.
> Once it is established that rank is working fine then add all the joins.
>
> I am still on Hive 0.10 so cannot test it myself.
> However, I can find a similar issue on following link - so its possible
> you are facing issues due to this reported bug.
>
> https://issues.apache.org/jira/browse/HIVE-4663
>
>
> Richa
>
>
> On Tue, Jul 16, 2013 at 6:41 PM, Jérôme Verdier <
> [EMAIL PROTECTED]> wrote:
>
>> You can see my query below :
>>
>> SELECT
>>             mag.co_magasin,
>>             dem.id_produit                                  as
>> id_produit_orig,
>>             pnvente.dt_debut_commercial                     as
>> dt_debut_commercial,
>>             COALESCE(pnvente.id_produit,dem.id_produit)     as id_produit,
>>             RANK() OVER (PARTITION BY mag.co_magasin, dem.id_produit
>> ORDER BY pnvente.dt_debut_commercial DESC,
>> COALESCE(pnvente.id_produit,dem.id_produit) DESC) as rang
>>
>> FROM default.demarque_mag_jour dem
>>
>>           LEFT OUTER JOIN default.produit_norm pn
>>           ON  pn.co_societe = dem.co_societe
>>           AND pn.id_produit = dem.id_produit
>>           LEFT OUTER JOIN default.produit_norm pnvente
>>           ON  pnvente.co_societe = pn.co_societe
>>           AND pnvente.co_produit_rfu = pn.co_produit_lip
>>           INNER JOIN default.kpi_magasin mag
>>           ON mag.id_magasin = dem.id_magasin
>>
>>
>> GROUP BY
>>             mag.co_magasin,
>>             dem.id_produit,
>>             pnvente.dt_debut_commercial,
>>             COALESCE(pnvente.id_produit,dem.id_produit);
>>
>>
>> 2013/7/16 Richa Sharma <[EMAIL PROTECTED]>
>>
>>> Can you share query with just RANK().
>>>
>>> Richa
>>>
>>>
>>> On Tue, Jul 16, 2013 at 6:08 PM, Jérôme Verdier <
>>> [EMAIL PROTECTED]> wrote:
>>>
>>>> Hi Richa,
>>>>
>>>> I tried to execute the rank function alone, but the result is the same
>>>>
>>>> Thanks
>>>>
>>>>
>>>> 2013/7/16 Richa Sharma <[EMAIL PROTECTED]>
>>>>
>>>>> Hi Jerome
>>>>>
>>>>>
>>>>> I think the problem is you are trying to use MIN, SUM and RANK
>>>>> function in a single query.
>>>>>
>>>>> Try to get the rank first in a query and on top of it apply these
>>>>> aggregate functions
>>>>>
>>>>> Richa
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Tue, Jul 16, 2013 at 2:15 PM, Jérôme Verdier <
>>>>> [EMAIL PROTECTED]> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I have a problem while using RANK OVER PARTITION function with Hive.
>>>>>>
>>>>>> Hive is in version 0.11 and, as we can see here :
>>>>>> https://cwiki.apache.org/Hive/languagemanual-windowingandanalytics.html,
>>>>>> we can now use these functions in Hive.
>>>>>>
>>>>>> But, when i use it, i encountered this error :
>>>>>>
>>>>>> FAILED: SemanticException Failed to breakup Windowing invocations
>>>>>> into Groups. At least 1 group must only depend on input columns. Also check
>>>>>> for circular dependencies.
>>>>>> Underlying error:
>>>>>> org.apache.hadoop.hive.ql.exec.UDFArgumentTypeException: One or more
>>>>>> arguments are expected.
>>>>>>
>>>>>> Here is my script :
>>>>>>
>>>>>> SELECT
>>>>>>             mag.co_magasin,
>>>>>>             dem.id_produit                                  as
>>>>>> id_produit_orig,
*Jérôme VERDIER*
06.72.19.17.31
[EMAIL PROTECTED]