Hive, mail # user - Semantics of Rank.


Re: Semantics of Rank.
Lefty Leverenz 2013-09-03, 08:23
Another email thread led me to HIVE-5038
<https://issues.apache.org/jira/browse/HIVE-5038> ("rank operator is
case-sensitive and has odd semantics") -- it's resolved as invalid, but is
that only for the odd semantics?

Perhaps this issue is clarified in more recent emails.  I'm catching up on
a huge backlog.

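For comparison, here is a minimal sketch of the result the rank query quoted further down is expected to produce. It uses Python's built-in sqlite3 module as a stand-in, not Hive, so it says nothing about Hive's own parser. The sample rows are hypothetical (the thread's actual cust_ord data is not shown here), and the sketch assumes the bundled SQLite is 3.25 or newer, which is when window functions were added.

```python
import sqlite3

# Hypothetical sample rows; the thread's actual cust_ord data is not shown.
ROWS = [
    ("c1", "2013-07-01"),
    ("c1", "2013-07-01"),  # tie on ord_dt within the same customer
    ("c1", "2013-07-05"),
    ("c2", "2013-07-02"),
]

# {fn} is replaced with different spellings of the function name.
QUERY = """
    SELECT cust_id, ord_dt,
           {fn}() OVER (PARTITION BY cust_id ORDER BY ord_dt) AS rnk
    FROM cust_ord
    ORDER BY cust_id, ord_dt, rnk
"""


def ranked(fn_name):
    """Run the rank query in SQLite with the given spelling of the function."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE cust_ord (cust_id TEXT, ord_dt TEXT)")
        conn.executemany("INSERT INTO cust_ord VALUES (?, ?)", ROWS)
        return conn.execute(QUERY.format(fn=fn_name)).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    # SQLite treats function names case-insensitively, so all three
    # spellings return the same rows.
    for spelling in ("rank", "Rank", "RANK"):
        print(spelling, ranked(spelling))
```

In this sketch, tied rows share a rank and the next distinct value skips ahead (1, 1, 3), and the spelling of the function name makes no difference. Whether Hive behaves the same way for every spelling is exactly what this thread is asking.
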
-- Lefty
On Tue, Sep 3, 2013 at 4:03 AM, Lefty Leverenz <[EMAIL PROTECTED]> wrote:

> What's the answer -- does the "rank" keyword have to be lowercase?
>
> If lowercase is obligatory we need to revise the wiki, which shows all
> uppercase (
> https://cwiki.apache.org/confluence/display/Hive/LanguageManual+WindowingAndAnalytics
> ).
>
> In the test files it's lowercase (windowing_rank.q, ptf_negative_WhereWithRankCond.q).
>  The patch for HIVE-896 shows a lowercase name in GenericUDAFRank.java but
> I don't know if that means lowercase is required:
>
>     @WindowFunctionDescription(
>         description = @Description(
>             name = "rank",
>             value = "_FUNC_(x)"
>         ),
>         supportsWindow = false,
>         pivotResult = true
>     )
>
>
> And what about the other keywords in the wikidoc?  Same lowercase
> requirement?
>
> -- Lefty
>
>
> On Fri, Jul 26, 2013 at 5:30 PM, saurabh <[EMAIL PROTECTED]> wrote:
>
>> Hi all,
>>
>> Below are some observations based on the ongoing rank function
>> discussion.
>>
>> 1. I ran the queries below. Only the query with "rank" (lowercase)
>> executed successfully; the others threw the exception "FAILED:
>> SemanticException Failed to breakup Windowing invocations into Groups."
>>
>> -  select cust_id, ord_dt, RANK() w from cust_ord window w as (partition
>> by cust_id order by ord_dt);
>>
>> -  select cust_id, ord_dt, Rank() w from cust_ord window w as (partition
>> by cust_id order by ord_dt);
>>
>> -   select cust_id, ord_dt, rank() w from cust_ord window w as (partition
>> by cust_id order by ord_dt);
>>
>> It seems the "rank" keyword is case-sensitive. A screenshot is attached
>> for reference.
>>
>> 2. I created a dummy table with the data provided in the mail trail below
>> and got the expected output using the following query.
>>
>> *select cust_id, ord_dt, rank() over (partition by cust_id order by
>> ord_dt) from cust_ord;*
>>
>> Please review these details and let me know whether they are of any
>> help!
>>
>> Thanks.
>>
>>
>> On Sat, Jul 27, 2013 at 12:07 AM, j.barrett Strausser <
>> [EMAIL PROTECTED]> wrote:
>>
>>> Any further help on this? Otherwise I'll file a JIRA.
>>>
>>>
>>> On Wed, Jul 24, 2013 at 11:32 PM, j.barrett Strausser <
>>> [EMAIL PROTECTED]> wrote:
>>>
>>>> As an example: if I run my query above with the argument removed, the
>>>> following is thrown.
>>>>
>>>> FAILED: SemanticException Failed to breakup Windowing invocations into
>>>> Groups. At least 1 group must only depend on input columns. Also check for
>>>> circular dependencies.
>>>> Underlying error:
>>>> org.apache.hadoop.hive.ql.exec.UDFArgumentTypeException: One or more
>>>> arguments are expected.
>>>>
>>>>
>>>> Similar issue and fix here:
>>>>
>>>> http://www.marshut.com/rqvpz/use-rank-over-partition-function-in-hive-11.html
>>>>
>>>> Even if it didn't require an arg it still doesn't explain my anomalous
>>>> output.
>>>>
>>>>
>>>>
>>>> On Wed, Jul 24, 2013 at 11:28 PM, j.barrett Strausser <
>>>> [EMAIL PROTECTED]> wrote:
>>>>
>>>>> That isn't true. If you try to run the above in Hive without an
>>>>> argument, it will throw an exception. I have seen other users replicate
>>>>> this problem as well.
>>>>>
>>>>> I can file a JIRA if someone can confirm that my query should work.
>>>>>
>>>>>
>>>>> On Wed, Jul 24, 2013 at 11:02 PM, [EMAIL PROTECTED] <
>>>>> [EMAIL PROTECTED]> wrote:
>>>>>
>>>>>> Analytic functions don't expect any argument. Rank() by itself is
>>>>>> enough to sequence rows based on the window you have defined in
>>>>>> partition by. So
>>>>>>
>>>>>> Rank() over (partition by cmscustid  order by orderdate)
>>