Hive >> mail # user >> Cartesian product detection in the query plan?


Re: Cartesian product detection in the query plan?
On 28 Jan 2013, at 14:29, Edward Capriolo wrote:

> Iirc hive.mapred.mode strict should prevent this. If not we should add
> it.

Hi Edward,

Yes, that's indeed what the book claims (quoting):

   hive> SELECT * FROM fracture_act JOIN fracture_ads
       > WHERE fracture_act.planner_id = fracture_ads.planner_id;
   FAILED: Error in semantic analysis: In strict mode, cartesian product
   is not allowed. If you really want to perform the operation,
   +set hive.mapred.mode=nonstrict+

I am about to re-enable this setting on my cluster (after fixing all the
queries that it broke, especially all the ORDER BY ones :-), but I had
hoped the cartesian product would be visible right there in the query
plan, or in some other way. If Hive can detect it, it should be visible
somewhere, right?
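
One possible heuristic, offered as a rough sketch and not as anything Hive itself provides: in the Stage-7 plan quoted below, both Reduce Output Operators have an empty "sort order:" and no "key expressions:" entry, which is typically what a join without a join condition looks like in this text format. The function name `suspicious_join_stages` is hypothetical, and the parsing assumes the old-style text EXPLAIN output shown in this thread (unquoted, with each stage introduced by a "Stage: " line); other Hive versions may print plans differently.

```python
import re


def suspicious_join_stages(explain_text):
    """Return names of stages that contain a Join Operator fed by
    Reduce Output Operators with no 'key expressions:' entry.

    Heuristic only: assumes the text EXPLAIN format quoted in this thread.
    """
    flagged = []
    # Split the plan into chunks, one per "Stage: Stage-N" header line.
    chunks = re.split(r"(?m)^[ \t]*Stage: ", explain_text)
    for chunk in chunks[1:]:
        lines = chunk.splitlines()
        if not lines:
            continue
        name = lines[0].strip()
        body = "\n".join(lines[1:])
        has_join = "Join Operator" in body
        has_reduce_output = "Reduce Output Operator" in body
        has_keys = "key expressions:" in body
        # A shuffle join whose inputs carry no join keys is suspicious.
        if has_join and has_reduce_output and not has_keys:
            flagged.append(name)
    return flagged


# Abbreviated copy of the stage from this thread, without the quote markers.
PLAN = """\
Stage: Stage-7
  Map Reduce
    Alias -> Map Operator Tree:
      $INTNAME
          Reduce Output Operator
            sort order:
            tag: 1
      $INTNAME1
          Reduce Output Operator
            sort order:
            tag: 0
    Reduce Operator Tree:
      Join Operator
        condition map:
             Inner Join 0 to 1
"""

print(suspicious_join_stages(PLAN))  # prints ['Stage-7']
```

This only looks at reduce-side joins; a map-side join has no Reduce Output Operator feeding it, so it would not be flagged by this check.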

Thanks!

david

>
> On Monday, January 28, 2013, David Morel <[EMAIL PROTECTED]> wrote:
>> Hi everyone,
>>
>> I had to kill some queries that were taking forever, and it turns out
>> they were doing cartesian products (missing ON clause on a JOIN).
>>
>> I wonder how I could see that in the EXPLAIN output (which I still
>> find a bit cryptic). Specifically, the stage that it was stuck in was
>> this:
>>
>> Stage: Stage-7
>> Map Reduce
>> Alias -> Map Operator Tree:
>>   $INTNAME
>>       Reduce Output Operator
>>         sort order:
>>         tag: 1
>>         value expressions:
>>               expr: _col1
>>               type: int
>>   $INTNAME1
>>       Reduce Output Operator
>>         sort order:
>>         tag: 0
>>         value expressions:
>>               expr: _col0
>>               type: bigint
>>               expr: _col1
>>               type: string
>> Reduce Operator Tree:
>>   Join Operator
>>     condition map:
>>          Inner Join 0 to 1
>>     condition expressions:
>>       0 {VALUE._col0} {VALUE._col1}
>>       1 {VALUE._col1}
>>     handleSkewJoin: false
>>     outputColumnNames: _col0, _col1, _col3
>>     File Output Operator
>>       compressed: true
>>       GlobalTableId: 0
>>       table:
>>           input format: org.apache.hadoop.mapred.SequenceFileInputFormat
>>           output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
>>
>> Is there anything in there that should have alerted me?
>>
>> I found out by looking at the query, but I wonder if the query plan
>> (if I could read it) would have given me that information.
>>
>> Thanks a lot
>>
>> David Morel
>>