Home | About | Sematext search-lucene.com search-hadoop.com
 Search Hadoop and all its subprojects:

Switch to Threaded View
Hive >> mail # user >> Use RANK OVER PARTITION function in Hive 0.11


Copy link to this message
-
Re: Use RANK OVER PARTITION function in Hive 0.11
Hi,

The error is resolved.

I have 3 RANK() function in my script.

2 are working with arguments, but one is working without arguments.

I don't understand it, but the fact is it's working =)

Thanks everyone for your help.
2013/7/18 Jérôme Verdier <[EMAIL PROTECTED]>

> Hi,
>
> Since we saw that we have to give arguments in RANK() function, i'm trying
> to translate this one (working on Oracle 10g) to be functionnally in Hive :
>
>
> RANK() OVER (PARTITION BY mag.co_magasin, dem.id_produit ORDER BY
> pnvente.dt_debut_commercial DESC,
> COALESCE(pnvente.id_produit,dem.id_produit) DESC) as rang
>
> i try this :
>
> RANK(pnvente.dt_debut_commercial,
> COALESCE(pnvente.id_produit,dem.id_produit)) OVER (PARTITION BY
> mag.co_magasin, dem.id_produit ORDER BY pnvente.dt_debut_commercial DESC,
> COALESCE(pnvente.id_produit,dem.id_produit) DESC) as rang
>
> and this :
>
>
> RANK(pnvente.dt_debut_commercial, pnvente.id_produit, dem.id_produit)
> OVER (PARTITION BY mag.co_magasin, dem.id_produit ORDER BY
> pnvente.dt_debut_commercial DESC,
> COALESCE(pnvente.id_produit,dem.id_produit) DESC) as rang
>
> But Hive is giving me another error :
>
> FAILED: SemanticException Failed to breakup Windowing invocations into
> Groups. At least 1 group must only depend on input columns. Also check for
> circular dependencies.
> Underlying error: Ranking Functions can take no arguments
>
> i don't understand this error, in the first try, he said that he can't
> work without arguments, and now, rank function is falling because of the
> arguments.
>
> what is wrong now ?
>
>
>
> 2013/7/17 Richa Sharma <[EMAIL PROTECTED]>
>
>> my bad ... in relational databases we generally do not give a column name
>> inside rank() ... but the one in (partition by .... order by..) is
>> sufficient.
>>
>> But looks like that's not the case in Hive
>>
>>
>> Jerome,
>>
>> Please look at the examples in link below. See if you are able to make it
>> work
>>
>>
>> https://cwiki.apache.org/confluence/display/Hive/LanguageManual+WindowingAndAnalytics#LanguageManualWindowingAndAnalytics-PARTITIONBYwithpartitioning%2CORDERBY%2Candwindowspecification
>>
>>
>>
>> Cant help you beyond this as i don't have Hive 0.11 :-(
>>
>>
>> Richa
>>
>>
>> On Wed, Jul 17, 2013 at 3:08 PM, Jérôme Verdier <
>> [EMAIL PROTECTED]> wrote:
>>
>>> Hi Richa,
>>>
>>> I have tried one query, with what i've understand of  Vijay's tips.
>>>
>>> SELECT code_entite, RANK(mag.me_vente_ht) OVER (PARTITION BY
>>> mag.co_societe ORDER BY  mag.me_vente_ht) AS rank FROM
>>> default.thm_renta_rgrp_produits_n_1 mag;
>>>
>>> This query is working, it gives me results.
>>>
>>> You say that maybe i'm hitting the same bug of JIRA HIVE-4663, but query
>>> is also failling when i put analytical columns in...
>>>
>>>
>>> 2013/7/17 Richa Sharma <[EMAIL PROTECTED]>
>>>
>>>> Vijay
>>>>
>>>> Jerome has already passed column -> mag.co_societe for rank.
>>>>
>>>> syntax -> RANK() OVER (PARTITION BY mag.co_societe ORDER BY
>>>> mag.me_vente_ht)
>>>> This will generate a rank for column mag.co_societe based on column
>>>> value me_vente_ht
>>>>
>>>> Jerome,
>>>>
>>>> Its possible you are also hitting the same bug as I mentioned in my
>>>> email before.
>>>>
>>>>
>>>> Richa
>>>>
>>>>
>>>> On Wed, Jul 17, 2013 at 2:31 PM, Vijay <[EMAIL PROTECTED]> wrote:
>>>>
>>>>> As the error message states: "One ore more arguments are expected,"
>>>>> you have to pass a column to the rank function.
>>>>>
>>>>>
>>>>> On Wed, Jul 17, 2013 at 1:12 AM, Jérôme Verdier <
>>>>> [EMAIL PROTECTED]> wrote:
>>>>>
>>>>>> Hi Richa,
>>>>>>
>>>>>> I have tried a simple query without joins, etc....
>>>>>>
>>>>>> SELECT RANK() OVER (PARTITION BY mag.co_societe ORDER BY
>>>>>> mag.me_vente_ht),mag.co_societe, mag.me_vente_ht FROM
>>>>>> default.thm_renta_rgrp_produits_n_1 mag;
>>>>>>
>>>>>> Unfortunately, the error is the same like previously.
>>>>>>
>>>>>> Error: Query returned non-zero code: 40000, cause: FAILED:
>

*Jérôme VERDIER*
06.72.19.17.31
[EMAIL PROTECTED]