Sqoop >> mail # user >> CLOB data not imported into HBase from Oracle


CLOB data not imported into HBase from Oracle
Hello,

I am running Sqoop 1.3.0-cdh3u4, as part of the Cloudera CDH.

I am trying to get data from Oracle 11gR2 to HBase. The import works, but
CLOB columns are not making it into HBase.

My simplest test case:

In Oracle:
CREATE TABLE TABLE1 ( NUMCOL NUMBER, STRCOL VARCHAR2(20 BYTE), CLOBCOL CLOB );
INSERT INTO TABLE1 (NUMCOL, STRCOL, CLOBCOL) VALUES (1, 'strval', 'clobval');

The sqoop command I run is the following (the connect parameter is shortened,
but works):

sqoop import --connect="jdbc:oracle:thin:..." --table TABLE1 \
  --hbase-table table1 --hbase-create-table --hbase-row-key NUMCOL \
  --column-family d -m 1

The job runs OK; the only surprising thing is the second-to-last line:
13/06/12 23:00:06 INFO mapreduce.ImportJobBase: Transferred 0 bytes in 7.3188 seconds (0 bytes/sec)
13/06/12 23:00:06 INFO mapreduce.ImportJobBase: Retrieved 1 records.

Anyway, after looking at the table in HBase:

# hbase shell
Version 0.90.6-cdh3u4, r, Mon May  7 13:14:00 PDT 2012

hbase(main):001:0> scan 'table1'
ROW                            COLUMN+CELL
 1                             column=d:STRCOL, timestamp=1371070804479, value=strval
1 row(s) in 0.6070 seconds
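
For anyone reproducing this, the check above can be scripted. The following is only a minimal sketch: it embeds the scan output shown above as sample text, and the `expected` column list and the `missing` variable are illustrative names, not part of Sqoop or HBase.

```shell
# Sample scan output copied from the session above (line-wrapping removed).
scan_output=$(cat <<'EOF'
ROW                            COLUMN+CELL
 1                             column=d:STRCOL, timestamp=1371070804479, value=strval
1 row(s) in 0.6070 seconds
EOF
)

# Columns expected as cells in family "d". NUMCOL is left out because in the
# scan above it appears only as the row key, not as a cell.
expected="STRCOL CLOBCOL"

missing=""
for col in $expected; do
  # A stored cell shows up in the scan as "column=d:<name>,"
  if ! printf '%s\n' "$scan_output" | grep -q "column=d:${col},"; then
    missing="${missing:+$missing }$col"
  fi
done

echo "missing columns: ${missing:-none}"
```

Against a live cluster, the sample text would presumably be replaced by the output of piping `scan 'table1'` into `hbase shell`; that step is an assumption and is not shown in the session above.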

The CLOBCOL is not there. CLOB handling in Sqoop must work in general,
because when I import the same table into Hive or just a text file, the CLOB
data is there. The problem exists only when importing into HBase. I
searched the Sqoop JIRA and the internet at large, but could not find any
mention of CLOBs not getting into HBase.

Thank you for your help,
Michal Taborsky

Jarek Jarcec Cecho 2013-06-12, 23:27
Michal Taborsky 2013-06-13, 07:13
Jarek Jarcec Cecho 2013-06-13, 14:45
Michal Taborsky 2013-06-13, 15:12
Jarek Jarcec Cecho 2013-06-13, 16:27
Michal Taborsky 2013-06-13, 18:46
Jarek Jarcec Cecho 2013-06-13, 18:50