Home | About | Sematext search-lucene.com search-hadoop.com
 Search Hadoop and all its subprojects:

Switch to Threaded View
Hive >> mail # user >> Problems with 0.11, count(DISTINCT), and NPE


Copy link to this message
-
Re: Problems with 0.11, count(DISTINCT), and NPE
Based on the log, it may be also related to
https://issues.apache.org/jira/browse/HIVE-4927. To make it work (in a not
very optimized way), can you try "set
hive.auto.convert.join.noconditionaltask=false;" ? If you still get the
error, give "set hive.auto.convert.join=false;" a try (it will turn off map
join auto convert, so you will use reduce-side join).

Thanks,

Yin
On Tue, Sep 3, 2013 at 6:03 PM, Ashutosh Chauhan <[EMAIL PROTECTED]>wrote:

> Not sure about EMR. Your best bet is to ask on EMR forums.
>
> Thanks,
> Ashutosh
>
>
> On Tue, Sep 3, 2013 at 2:18 PM, Nathanial Thelen <[EMAIL PROTECTED]>wrote:
>
>> Is there a way to run a patch on EMR?
>>
>> Thanks,
>> Nate
>>
>> On Sep 3, 2013, at 2:14 PM, Ashutosh Chauhan <[EMAIL PROTECTED]>
>> wrote:
>>
>> Fix in very related area has been checked in trunk today :
>> https://issues.apache.org/jira/browse/HIVE-5129 Likely that will fix
>> your issue.
>> Can you try latest trunk?
>>
>> Ashutosh
>>
>>
>> On Tue, Sep 3, 2013 at 2:03 PM, Nathanial Thelen <[EMAIL PROTECTED]>wrote:
>>
>>> I am running Hive in EMR and since upgrading to 0.11 from 0.8.1.8 I have
>>> been getting NullPointerExceptions (NPE) for certain queries in our staging
>>> environment.  Only difference between stage and production is the amount of
>>> traffic we get so the data set is much smaller.  We are not using any
>>> custom code.
>>>
>>> I have greatly simplified the query down to the bare minimum that will
>>> cause the error:
>>>
>>> SELECT
>>>     count(DISTINCT ag.adGroupGuid) as groups,
>>>     count(DISTINCT av.adViewGuid) as ads,
>>>     count(DISTINCT ac.adViewGuid) as uniqueClicks
>>> FROM
>>>     adgroup ag
>>>     INNER JOIN adview av ON av.adGroupGuid = ag.adGroupGuid
>>>     LEFT OUTER JOIN adclick ac ON ac.adViewGuid = av.adViewGuid
>>>
>>> This will return the following before any Map Reduce jobs start:
>>>
>>> FAILED: NullPointerException null
>>>
>>> Looking in the hive log at /mnt/var/log/apps/hive_0110.log and scanning,
>>> I see this error:
>>>
>>> 2013-09-03 18:09:19,796 INFO  org.apache.hadoop.hive.ql.exec.Utilities
>>> (Utilities.java:getInputSummary(1889)) - Cache Content Summary for
>>> s3://{ourS3Bucket}/hive/data/stage/adgroup/year=2013/month=08/day=29
>>> length: 94324 file count: 20 directory count: 1
>>> 2013-09-03 18:09:19,796 INFO  org.apache.hadoop.hive.ql.exec.Utilities
>>> (Utilities.java:getInputSummary(1889)) - Cache Content Summary for
>>> s3://{ourS3Bucket}/hive/data/stage/adview/year=2013/month=08/day=30 length:
>>> 142609 file count: 21 directory count: 1
>>> 2013-09-03 18:09:19,796 INFO  org.apache.hadoop.hive.ql.exec.Utilities
>>> (Utilities.java:getInputSummary(1889)) - Cache Content Summary for
>>> s3://{ourS3Bucket}/hive/data/stage/adgroup/year=2013/month=08/day=30
>>> length: 65519 file count: 21 directory count: 1
>>> 2013-09-03 18:09:19,796 INFO  org.apache.hadoop.hive.ql.exec.Utilities
>>> (Utilities.java:getInputSummary(1889)) - Cache Content Summary for
>>> s3://{ourS3Bucket}/hive/data/stage/adview/year=2013/month=08/day=29 length:
>>> 205096 file count: 20 directory count: 1
>>> 2013-09-03 18:09:19,800 INFO
>>>  org.apache.hadoop.hive.ql.optimizer.physical.MetadataOnlyOptimizer
>>> (MetadataOnlyOptimizer.java:dispatch(267)) - Looking for table scans where
>>> optimization is applicable
>>> 2013-09-03 18:09:19,801 INFO
>>>  org.apache.hadoop.hive.ql.optimizer.physical.MetadataOnlyOptimizer
>>> (MetadataOnlyOptimizer.java:dispatch(301)) - Found 0 metadata only table
>>> scans
>>> 2013-09-03 18:09:19,801 INFO
>>>  org.apache.hadoop.hive.ql.optimizer.physical.MetadataOnlyOptimizer
>>> (MetadataOnlyOptimizer.java:dispatch(267)) - Looking for table scans where
>>> optimization is applicable
>>> 2013-09-03 18:09:19,801 INFO
>>>  org.apache.hadoop.hive.ql.optimizer.physical.MetadataOnlyOptimizer
>>> (MetadataOnlyOptimizer.java:dispatch(301)) - Found 1 metadata only table
>>> scans
>>> 2013-09-03 18:09:19,801 ERROR org.apache.hadoop.hive.ql.Driver
>>> (SessionState.java:printError(386)) - FAILED: NullPointerException null