Sqoop, mail # user - CLOB data not imported into HBase from Oracle


Re: CLOB data not imported into HBase from Oracle
Michal Taborsky 2013-06-13, 15:12
Thanks Jarcec.

This fixed the issue and the data is coming into HBase. Sqoop could do this
automatically, I suppose, because for the Hive import it works.

But at least now I have a workaround. It works even in the 1.3.0 version,
by the way.
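
For reference, the workaround amounts to the original import command with the
type-mapping flag added. The following is only a sketch, assuming the table and
column names from the test case quoted below; the connect string stays
shortened as in the original post, and build_import_cmd is a hypothetical
helper that prints the command rather than running it:

```shell
#!/usr/bin/env bash
# Sketch only: prints the import command instead of executing it.
# The connect string is elided in the original post and is left elided here.
CONNECT='jdbc:oracle:thin:...'

# Hypothetical helper: assembles the original HBase import plus the
# --map-column-java flag suggested in this thread.
build_import_cmd() {
  printf '%s ' sqoop import \
    --connect="$CONNECT" \
    --table TABLE1 \
    --hbase-table table1 \
    --hbase-create-table \
    --hbase-row-key NUMCOL \
    --column-family d \
    --map-column-java CLOBCOL=String \
    -m 1
  printf '\n'
}

build_import_cmd
```

Once the printed command looks right, it can be pasted into a shell where
sqoop is installed, with the real connect string filled in.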

Thanks again,
Michal Taborsky
2013/6/13 Jarek Jarcec Cecho <[EMAIL PROTECTED]>

> Thank you for upgrading your Sqoop installation Michal! Would you mind
> trying to map the column from CLOB to string to see if that helps?
> Something like:
>
>   sqoop import ... --map-column-java CLOBCOL=String
>
> More information about type mapping can be found in our user guide:
>
>
> http://sqoop.apache.org/docs/1.4.3/SqoopUserGuide.html#_controlling_type_mapping
>
> Jarcec
>
> On Thu, Jun 13, 2013 at 09:13:16AM +0200, Michal Taborsky wrote:
> > Thanks, Jarcec, for the suggestion. Well, I tried with 1.4.3 and the result
> > is the same.
> >
> > Michal Taborsky
> >
> >
> > 2013/6/13 Jarek Jarcec Cecho <[EMAIL PROTECTED]>
> >
> > > Hi Michal,
> > > version 1.3.0 is quite an old release (and CDH3 is no longer supported),
> > > so I would strongly suggest upgrading to the latest release, which can
> > > be downloaded from [1].
> > >
> > > Jarcec
> > >
> > > Links:
> > > 1: http://www.apache.org/dyn/closer.cgi/sqoop/1.4.3
> > >
> > > On Wed, Jun 12, 2013 at 11:11:49PM +0200, Michal Taborsky wrote:
> > > > Hello,
> > > >
> > > > I am running Sqoop 1.3.0-cdh3u4, as part of the Cloudera CDH.
> > > >
> > > > I am trying to get data from Oracle 11gR2 to HBase. The import works, but
> > > > CLOB columns are not making it into HBase.
> > > >
> > > > My simplest testcase:
> > > >
> > > > In Oracle:
> > > > CREATE TABLE TABLE1 (
> > > >   NUMCOL NUMBER,
> > > >   STRCOL VARCHAR2(20 BYTE),
> > > >   CLOBCOL CLOB
> > > > );
> > > > INSERT INTO TABLE1 (NUMCOL, STRCOL, CLOBCOL) VALUES (1, 'strval',
> > > > 'clobval');
> > > >
> > > > The sqoop command I run is the following (the connect parameter is
> > > > shortened, but works):
> > > >
> > > > sqoop import --connect="jdbc:oracle:thin:..." --table TABLE1 \
> > > >   --hbase-table table1 --hbase-create-table --hbase-row-key NUMCOL \
> > > >   --column-family d -m 1
> > > >
> > > > The job runs OK; the only surprising thing is the second-to-last line:
> > > > 13/06/12 23:00:06 INFO mapreduce.ImportJobBase: Transferred 0 bytes in 7.3188 seconds (0 bytes/sec)
> > > > 13/06/12 23:00:06 INFO mapreduce.ImportJobBase: Retrieved 1 records.
> > > >
> > > > Anyway, after looking at the table in HBase:
> > > >
> > > > # hbase shell
> > > > Version 0.90.6-cdh3u4, r, Mon May  7 13:14:00 PDT 2012
> > > >
> > > > hbase(main):001:0> scan 'table1'
> > > > ROW                            COLUMN+CELL
> > > >  1                             column=d:STRCOL, timestamp=1371070804479, value=strval
> > > > 1 row(s) in 0.6070 seconds
> > > >
> > > > The CLOBCOL is not there. CLOB handling in Sqoop must work in general,
> > > > because when I import the same table into Hive or just a text file, the
> > > > CLOB data is there. The problem exists only when importing into HBase. I
> > > > tried searching the Sqoop Jira and the internets at large, but could not
> > > > find any mention of CLOBs not getting into HBase.
> > > >
> > > > Thank you for your help,
> > > > Michal Taborsky
> > >
>