Re: Problems with 0.11, count(DISTINCT), and NPE
Based on the log, this may also be related to
https://issues.apache.org/jira/browse/HIVE-4927. To work around it (at the
cost of a less optimized plan), can you try "set
hive.auto.convert.join.noconditionaltask=false;"? If you still get the
error, try "set hive.auto.convert.join=false;" (this turns off automatic
map-join conversion, so the query will use a reduce-side join).
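
As a rough sketch of how the two settings could be tried in a Hive session, assuming the simplified query quoted further down in this thread: the settings are session-level, and the second one is left commented out so that it is only enabled if the first attempt still fails with the NPE.

```sql
-- Attempt 1: keep automatic map-join conversion enabled, but do not let
-- Hive convert the join directly into an unconditional map-join task.
set hive.auto.convert.join.noconditionaltask=false;

-- Attempt 2 (only if the NPE persists): turn off automatic map-join
-- conversion entirely, so the joins run as reduce-side joins.
-- set hive.auto.convert.join=false;

-- Re-run the failing query from the original report.
SELECT
    count(DISTINCT ag.adGroupGuid) as groups,
    count(DISTINCT av.adViewGuid) as ads,
    count(DISTINCT ac.adViewGuid) as uniqueClicks
FROM
    adgroup ag
    INNER JOIN adview av ON av.adGroupGuid = ag.adGroupGuid
    LEFT OUTER JOIN adclick ac ON ac.adViewGuid = av.adViewGuid;
```

Either setting trades some join performance for a different plan, so it is probably best applied per session rather than in the site-wide configuration.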

Thanks,

Yin
On Tue, Sep 3, 2013 at 6:03 PM, Ashutosh Chauhan <[EMAIL PROTECTED]> wrote:

> Not sure about EMR. Your best bet is to ask on EMR forums.
>
> Thanks,
> Ashutosh
>
>
> On Tue, Sep 3, 2013 at 2:18 PM, Nathanial Thelen <[EMAIL PROTECTED]> wrote:
>
>> Is there a way to run a patch on EMR?
>>
>> Thanks,
>> Nate
>>
>> On Sep 3, 2013, at 2:14 PM, Ashutosh Chauhan <[EMAIL PROTECTED]>
>> wrote:
>>
>> A fix in a closely related area was checked into trunk today:
>> https://issues.apache.org/jira/browse/HIVE-5129. That will likely fix
>> your issue.
>> Can you try the latest trunk?
>>
>> Ashutosh
>>
>>
>> On Tue, Sep 3, 2013 at 2:03 PM, Nathanial Thelen <[EMAIL PROTECTED]> wrote:
>>
>>> I am running Hive on EMR, and since upgrading from 0.8.1.8 to 0.11 I have
>>> been getting NullPointerExceptions (NPEs) for certain queries in our staging
>>> environment.  The only difference between staging and production is the
>>> amount of traffic we get, so the staging data set is much smaller.  We are
>>> not using any custom code.
>>>
>>> I have simplified the query down to the bare minimum that still
>>> causes the error:
>>>
>>> SELECT
>>>     count(DISTINCT ag.adGroupGuid) as groups,
>>>     count(DISTINCT av.adViewGuid) as ads,
>>>     count(DISTINCT ac.adViewGuid) as uniqueClicks
>>> FROM
>>>     adgroup ag
>>>     INNER JOIN adview av ON av.adGroupGuid = ag.adGroupGuid
>>>     LEFT OUTER JOIN adclick ac ON ac.adViewGuid = av.adViewGuid
>>>
>>> This returns the following before any MapReduce jobs start:
>>>
>>> FAILED: NullPointerException null
>>>
>>> Scanning the Hive log at /mnt/var/log/apps/hive_0110.log,
>>> I see this error:
>>>
>>> 2013-09-03 18:09:19,796 INFO  org.apache.hadoop.hive.ql.exec.Utilities
>>> (Utilities.java:getInputSummary(1889)) - Cache Content Summary for
>>> s3://{ourS3Bucket}/hive/data/stage/adgroup/year=2013/month=08/day=29
>>> length: 94324 file count: 20 directory count: 1
>>> 2013-09-03 18:09:19,796 INFO  org.apache.hadoop.hive.ql.exec.Utilities
>>> (Utilities.java:getInputSummary(1889)) - Cache Content Summary for
>>> s3://{ourS3Bucket}/hive/data/stage/adview/year=2013/month=08/day=30 length:
>>> 142609 file count: 21 directory count: 1
>>> 2013-09-03 18:09:19,796 INFO  org.apache.hadoop.hive.ql.exec.Utilities
>>> (Utilities.java:getInputSummary(1889)) - Cache Content Summary for
>>> s3://{ourS3Bucket}/hive/data/stage/adgroup/year=2013/month=08/day=30
>>> length: 65519 file count: 21 directory count: 1
>>> 2013-09-03 18:09:19,796 INFO  org.apache.hadoop.hive.ql.exec.Utilities
>>> (Utilities.java:getInputSummary(1889)) - Cache Content Summary for
>>> s3://{ourS3Bucket}/hive/data/stage/adview/year=2013/month=08/day=29 length:
>>> 205096 file count: 20 directory count: 1
>>> 2013-09-03 18:09:19,800 INFO
>>>  org.apache.hadoop.hive.ql.optimizer.physical.MetadataOnlyOptimizer
>>> (MetadataOnlyOptimizer.java:dispatch(267)) - Looking for table scans where
>>> optimization is applicable
>>> 2013-09-03 18:09:19,801 INFO
>>>  org.apache.hadoop.hive.ql.optimizer.physical.MetadataOnlyOptimizer
>>> (MetadataOnlyOptimizer.java:dispatch(301)) - Found 0 metadata only table
>>> scans
>>> 2013-09-03 18:09:19,801 INFO
>>>  org.apache.hadoop.hive.ql.optimizer.physical.MetadataOnlyOptimizer
>>> (MetadataOnlyOptimizer.java:dispatch(267)) - Looking for table scans where
>>> optimization is applicable
>>> 2013-09-03 18:09:19,801 INFO
>>>  org.apache.hadoop.hive.ql.optimizer.physical.MetadataOnlyOptimizer
>>> (MetadataOnlyOptimizer.java:dispatch(301)) - Found 1 metadata only table
>>> scans
>>> 2013-09-03 18:09:19,801 ERROR org.apache.hadoop.hive.ql.Driver
>>> (SessionState.java:printError(386)) - FAILED: NullPointerException null