

Cannot join two accumulo tables together using accumulo-hive-storage-manager - null columnMapping not allowed
Hi,

I've been using the accumulo-hive-storage-manager to query Accumulo tables
from Hive. This works great for simple SELECT statements. However, when I
try to JOIN two Accumulo tables, I receive the error "null
columnMapping not allowed".

I have added further details on my table structure and the query I am
running below.

I'm not sure what I have done wrong here.

Thanks,
Andrew

ACCUMULO - This creates the Accumulo tables that I am using:

createtable statuses
insert bbc.co.uk status_code 200 2450
insert bbc.co.uk status_code 404 16
insert google.co.uk status_code 200 10345
insert google.co.uk status_code 404 10
createtable rankings
insert bbc.co.uk ranking alexa 45
insert google.co.uk ranking alexa 1

HIVE - I am creating the hive tables using the below:

CREATE EXTERNAL TABLE hivestatuses(rowid STRING, status200 STRING,
status404 STRING)
STORED BY 'org.apache.accumulo.storagehandler.AccumuloStorageHandler'
WITH SERDEPROPERTIES ('accumulo.columns.mapping' = 'rowID,status_code|200,status_code|404', 'accumulo.table.name' = 'statuses');

CREATE EXTERNAL TABLE hiverankings(rowid STRING, alexa_ranking STRING)
STORED BY 'org.apache.accumulo.storagehandler.AccumuloStorageHandler'
WITH SERDEPROPERTIES ('accumulo.columns.mapping' = 'rowID,ranking|alexa', 'accumulo.table.name' = 'rankings');
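
To make the intended mapping concrete, here is a minimal Python sketch of how the 'accumulo.columns.mapping' string for hivestatuses is expected to line up with the entries inserted above. It is not the storage handler's code. It assumes that 'rowID' marks the row key and that every other element is 'family|qualifier'; the helper names parse_mapping and pivot are hypothetical.

```python
# Entries as inserted in the Accumulo shell: (row, family, qualifier, value).
ENTRIES = [
    ("bbc.co.uk", "status_code", "200", "2450"),
    ("bbc.co.uk", "status_code", "404", "16"),
    ("google.co.uk", "status_code", "200", "10345"),
    ("google.co.uk", "status_code", "404", "10"),
]


def parse_mapping(mapping):
    """Split a mapping string into one spec per Hive column.

    Assumption: 'rowID' marks the row key (spec None) and every other
    element has the form 'family|qualifier'.
    """
    specs = []
    for element in mapping.split(","):
        if element == "rowID":
            specs.append(None)
        else:
            family, qualifier = element.split("|", 1)
            specs.append((family, qualifier))
    return specs


def pivot(entries, mapping):
    """Build one tuple per row id, ordered like the Hive columns."""
    specs = parse_mapping(mapping)
    cells = {}
    for row, family, qualifier, value in entries:
        cells.setdefault(row, {})[(family, qualifier)] = value
    rows = []
    for row in sorted(cells):
        # A missing cell becomes None, mirroring a NULL column in Hive.
        rows.append(tuple(
            row if spec is None else cells[row].get(spec) for spec in specs
        ))
    return rows


rows = pivot(ENTRIES, "rowID,status_code|200,status_code|404")
for r in rows:
    print(r)
# ('bbc.co.uk', '2450', '16')
# ('google.co.uk', '10345', '10')
```

Under these assumptions each Accumulo row becomes one Hive row with columns (rowid, status200, status404).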

At this point I am able to run queries against either hive table with the
correct results being returned. However, I am unable to JOIN the two tables
using the below query, for example:

SELECT s.rowid, s.status200, r.alexa_ranking FROM hivestatuses s JOIN
hiverankings r ON s.rowid = r.rowid;
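
For reference, the result this query should produce from the sample data can be worked out by hand. The Python sketch below does that with a small in-memory hash join. It only illustrates the expected output; it is not how Hive or the storage handler executes the query, and the function name inner_join is hypothetical.

```python
# Rows as they should appear in the two Hive tables, given the sample data.
hivestatuses = [
    ("bbc.co.uk", "2450", "16"),      # (rowid, status200, status404)
    ("google.co.uk", "10345", "10"),
]
hiverankings = [
    ("bbc.co.uk", "45"),              # (rowid, alexa_ranking)
    ("google.co.uk", "1"),
]


def inner_join(statuses, rankings):
    """Return (rowid, status200, alexa_ranking) for matching row ids."""
    # Index the rankings side by row id.
    by_rowid = {}
    for rowid, alexa_ranking in rankings:
        by_rowid.setdefault(rowid, []).append(alexa_ranking)
    # Probe the index with each statuses row.
    result = []
    for rowid, status200, _status404 in statuses:
        for alexa_ranking in by_rowid.get(rowid, []):
            result.append((rowid, status200, alexa_ranking))
    return result


expected = inner_join(hivestatuses, hiverankings)
for row in expected:
    print(row)
# ('bbc.co.uk', '2450', '45')
# ('google.co.uk', '10345', '1')
```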

Error received:

2013-09-19 10:30:40     Starting to launch local task to process map join;  maximum memory = 1065484288
java.io.IOException: java.lang.IllegalArgumentException: null columnMapping not allowed.
        at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:544)
        at org.apache.hadoop.hive.ql.exec.MapredLocalTask.startForward(MapredLocalTask.java:342)
        at org.apache.hadoop.hive.ql.exec.MapredLocalTask.executeFromChildJVM(MapredLocalTask.java:300)
        at org.apache.hadoop.hive.ql.exec.ExecDriver.main(ExecDriver.java:682)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:160)
Caused by: java.lang.IllegalArgumentException: null columnMapping not allowed.
        at org.apache.accumulo.storagehandler.AccumuloHiveUtils.parseColumnMapping(AccumuloHiveUtils.java:50)
        at org.apache.accumulo.storagehandler.AccumuloHiveUtils.hiveColForRowID(AccumuloHiveUtils.java:60)
        at org.apache.accumulo.storagehandler.predicate.AccumuloPredicateHandler.getIterators(AccumuloPredicateHandler.java:142)
        at org.apache.accumulo.storagehandler.HiveAccumuloTableInputFormat.configure(HiveAccumuloTableInputFormat.java:269)
        at org.apache.accumulo.storagehandler.HiveAccumuloTableInputFormat.getSplits(HiveAccumuloTableInputFormat.java:59)
        at org.apache.hadoop.hive.ql.exec.FetchOperator.getRecordReader(FetchOperator.java:380)
        at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:508)
        ... 8 more
Execution failed with exit status: 2