Sqoop >> mail # user >> CLOB data not imported into HBase from Oracle


Re: CLOB data not imported into HBase from Oracle
Thank you for upgrading your Sqoop installation, Michal! Would you mind trying to map the column from CLOB to String to see if that helps? Something like:

  sqoop import ... --map-column-java CLOBCOL=String

More information about type mapping can be found in our user guide:

http://sqoop.apache.org/docs/1.4.3/SqoopUserGuide.html#_controlling_type_mapping
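
For reference, here is a rough sketch of how the full command might look with the mapping added. It assumes every other option from your original command stays unchanged; the connect string is left elided exactly as you posted it, and the variable names are only placeholders of mine. I have not verified that this resolves the HBase case, so please treat it as something to try rather than a confirmed fix:

```shell
# Connect string left elided, exactly as in the original post.
CONNECT='jdbc:oracle:thin:...'

# Original import options plus the suggested Java type override
# for the CLOB column (CLOBCOL is mapped to java.lang.String).
SQOOP_CMD="sqoop import --connect=\"$CONNECT\" --table TABLE1 \
--hbase-table table1 --hbase-create-table --hbase-row-key NUMCOL \
--column-family d --map-column-java CLOBCOL=String -m 1"

# Print the assembled command for review before running it.
echo "$SQOOP_CMD"
```

After the import, running scan 'table1' in the HBase shell again should show whether a d:CLOBCOL cell now appears next to d:STRCOL.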

Jarcec

On Thu, Jun 13, 2013 at 09:13:16AM +0200, Michal Taborsky wrote:
> Thanks, Jarcec, for the suggestion. Well, I tried with 1.4.3 and the result
> is the same.
>
> Michal Taborsky
>
>
> 2013/6/13 Jarek Jarcec Cecho <[EMAIL PROTECTED]>
>
> > Hi Michal,
> > version 1.3.0 is quite an old release (and CDH3 is not supported anymore),
> > so I would strongly suggest that you upgrade to the latest release,
> > which can be downloaded from [1].
> >
> > Jarcec
> >
> > Links:
> > 1: http://www.apache.org/dyn/closer.cgi/sqoop/1.4.3
> >
> > On Wed, Jun 12, 2013 at 11:11:49PM +0200, Michal Taborsky wrote:
> > > Hello,
> > >
> > > I am running Sqoop 1.3.0-cdh3u4, as part of the Cloudera CDH.
> > >
> > > I am trying to get data from Oracle 11gR2 to HBase. The import works, but
> > > CLOB columns are not making it into HBase.
> > >
> > > My simplest testcase:
> > >
> > > In Oracle:
> > > CREATE TABLE TABLE1 ( NUMCOL NUMBER, STRCOL VARCHAR2(20 BYTE), CLOBCOL CLOB );
> > > INSERT INTO TABLE1 (NUMCOL, STRCOL, CLOBCOL) VALUES (1, 'strval', 'clobval');
> > >
> > > The Sqoop command I run is the following (the connect parameter is
> > > shortened, but works):
> > >
> > > sqoop import --connect="jdbc:oracle:thin:..." --table TABLE1 --hbase-table table1 --hbase-create-table --hbase-row-key NUMCOL --column-family d -m 1
> > >
> > > The job runs OK; the only surprising thing is the second-to-last line:
> > > 13/06/12 23:00:06 INFO mapreduce.ImportJobBase: Transferred 0 bytes in 7.3188 seconds (0 bytes/sec)
> > > 13/06/12 23:00:06 INFO mapreduce.ImportJobBase: Retrieved 1 records.
> > >
> > > Anyway, after looking at the table in HBase:
> > >
> > > # hbase shell
> > > Version 0.90.6-cdh3u4, r, Mon May  7 13:14:00 PDT 2012
> > >
> > > hbase(main):001:0> scan 'table1'
> > > ROW                            COLUMN+CELL
> > >  1                             column=d:STRCOL, timestamp=1371070804479, value=strval
> > > 1 row(s) in 0.6070 seconds
> > >
> > > The CLOBCOL is not there. The CLOB handling in Sqoop must work in
> > > general, because when I import the same table into Hive or just a text
> > > file, the CLOB data is there. The problem exists only when importing
> > > into HBase. I tried searching the Sqoop Jira and the internets at large,
> > > but could not find any mention of CLOBs not getting into HBase.
> > >
> > > Thank you for your help,
> > > Michal Taborsky
> >