Accumulo >> mail # user >> Cannot join two accumulo tables together using accumulo-hive-storage-manager - null columnMapping not allowed


Cannot join two accumulo tables together using accumulo-hive-storage-manager - null columnMapping not allowed
Hi,

I've been using the accumulo-hive-storage-manager to query Accumulo tables
from Hive. This works well for simple SELECT statements. However, when I
try to JOIN two Accumulo tables, I receive the error "null columnMapping
not allowed".

I have added further details on my table structure and the query I am
running below.

I'm not sure what I have done wrong here.

Thanks,
Andrew

ACCUMULO - This creates the Accumulo tables that I am using:

createtable statuses
insert bbc.co.uk status_code 200 2450
insert bbc.co.uk status_code 404 16
insert google.co.uk status_code 200 10345
insert google.co.uk status_code 404 10
createtable rankings
insert bbc.co.uk ranking alexa  45
insert google.co.uk ranking alexa  1

HIVE - I am creating the hive tables using the below:

CREATE EXTERNAL TABLE hivestatuses(rowid STRING, status200 STRING,
status404 STRING)
STORED BY 'org.apache.accumulo.storagehandler.AccumuloStorageHandler'
WITH SERDEPROPERTIES ('accumulo.columns.mapping' =
'rowID,status_code|200,status_code|404', 'accumulo.table.name' = 'statuses');

CREATE EXTERNAL TABLE hiverankings(rowid STRING, alexa_ranking STRING)
STORED BY 'org.apache.accumulo.storagehandler.AccumuloStorageHandler'
WITH SERDEPROPERTIES ('accumulo.columns.mapping' = 'rowID,ranking|alexa',
'accumulo.table.name' = 'rankings');
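
To make the two mapping strings concrete, here is a minimal Python sketch of how each Hive column appears to pair, by position, with an entry of accumulo.columns.mapping. The function name pair_columns is hypothetical. The positional rule and the family|qualifier reading are assumptions inferred from the two table definitions above, not taken from the storage handler's source.

```python
def pair_columns(hive_columns, mapping):
    """Pair Hive column names with mapping entries by position.

    Hypothetical helper for illustration only; it is not part of
    accumulo-hive-storage-manager.
    """
    if mapping is None:
        # Reuses the wording of the error reported in this thread.
        raise ValueError("null columnMapping not allowed.")
    entries = mapping.split(",")
    if len(entries) != len(hive_columns):
        raise ValueError("column count and mapping entry count differ")
    pairs = {}
    for name, entry in zip(hive_columns, entries):
        if entry == "rowID":
            # The row ID has no column family or qualifier.
            pairs[name] = ("rowID", None)
        else:
            # Assumed format: column family, then "|", then qualifier.
            family, qualifier = entry.split("|", 1)
            pairs[name] = (family, qualifier)
    return pairs


statuses = pair_columns(
    ["rowid", "status200", "status404"],
    "rowID,status_code|200,status_code|404",
)
rankings = pair_columns(["rowid", "alexa_ranking"], "rowID,ranking|alexa")
print(statuses)
print(rankings)
```

Under these assumptions, status200 pairs with family status_code and qualifier 200, and alexa_ranking pairs with family ranking and qualifier alexa. Passing None as the mapping raises the same message as the error in this thread, which suggests the mapping property is missing wherever the failing code runs.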

At this point I am able to run queries against either hive table with the
correct results being returned. However, I am unable to JOIN the two tables
using the below query, for example:

SELECT s.rowid, s.status200, r.alexa_ranking FROM hivestatuses s JOIN
hiverankings r ON s.rowid = r.rowid;
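
For reference, the result this JOIN should produce can be sketched in plain Python from the sample rows inserted above. The dictionaries and the inner_join function are hypothetical stand-ins for the two Hive tables, not anything the storage handler provides.

```python
# Sample rows as the two Hive tables would expose them. Values are strings,
# matching the STRING column types in the CREATE statements.
hivestatuses = {
    "bbc.co.uk": {"status200": "2450", "status404": "16"},
    "google.co.uk": {"status200": "10345", "status404": "10"},
}
hiverankings = {
    "bbc.co.uk": {"alexa_ranking": "45"},
    "google.co.uk": {"alexa_ranking": "1"},
}


def inner_join(statuses, rankings):
    """Return (rowid, status200, alexa_ranking) for rowids in both tables."""
    rows = []
    # Sorting is only for a stable printout; Hive does not guarantee order.
    for rowid in sorted(statuses.keys() & rankings.keys()):
        rows.append(
            (rowid, statuses[rowid]["status200"], rankings[rowid]["alexa_ranking"])
        )
    return rows


for row in inner_join(hivestatuses, hiverankings):
    print(row)
```

With the sample data, this yields one row per site: bbc.co.uk with 2450 and 45, and google.co.uk with 10345 and 1.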

Error received:

2013-09-19 10:30:40     Starting to launch local task to process map join;     maximum memory = 1065484288
java.io.IOException: java.lang.IllegalArgumentException: null columnMapping not allowed.
        at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:544)
        at org.apache.hadoop.hive.ql.exec.MapredLocalTask.startForward(MapredLocalTask.java:342)
        at org.apache.hadoop.hive.ql.exec.MapredLocalTask.executeFromChildJVM(MapredLocalTask.java:300)
        at org.apache.hadoop.hive.ql.exec.ExecDriver.main(ExecDriver.java:682)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:160)
Caused by: java.lang.IllegalArgumentException: null columnMapping not allowed.
        at org.apache.accumulo.storagehandler.AccumuloHiveUtils.parseColumnMapping(AccumuloHiveUtils.java:50)
        at org.apache.accumulo.storagehandler.AccumuloHiveUtils.hiveColForRowID(AccumuloHiveUtils.java:60)
        at org.apache.accumulo.storagehandler.predicate.AccumuloPredicateHandler.getIterators(AccumuloPredicateHandler.java:142)
        at org.apache.accumulo.storagehandler.HiveAccumuloTableInputFormat.configure(HiveAccumuloTableInputFormat.java:269)
        at org.apache.accumulo.storagehandler.HiveAccumuloTableInputFormat.getSplits(HiveAccumuloTableInputFormat.java:59)
        at org.apache.hadoop.hive.ql.exec.FetchOperator.getRecordReader(FetchOperator.java:380)
        at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:508)
        ... 8 more
Execution failed with exit status: 2