Sqoop >> mail # dev >> Handling CLOBs in Sqoop - Hive Import


Re: Handling CLOBs in Sqoop - Hive Import
Hi Rahul,
sadly the parameter --hive-drop-import-delims is applicable only for String based types (such as CHAR, VARCHAR, NCHAR, ...), it's not applicable to CLOB. To workaround this, you can re-type the CLOB field into String using --map-column-java parameter, such as:

  sqoop import --map-column-java $columnName=String

Jarcec
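
For context, a fuller invocation of this workaround might look like the sketch below. The JDBC URL, credentials, table name, and column name are placeholders, not values from this thread; this cannot run without a Sqoop installation and a reachable Oracle instance:

```shell
# Hypothetical Oracle import: re-typing the CLOB column to String means Sqoop
# treats it like any other character field, so --hive-drop-import-delims can
# strip the embedded newlines before the data is loaded into Hive.
sqoop import \
  --connect jdbc:oracle:thin:@//dbhost:1521/ORCL \
  --username myuser -P \
  --table MY_TABLE \
  --map-column-java CLOB_COLUMN=String \
  --hive-drop-import-delims \
  --hive-import
```
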

On Wed, Nov 06, 2013 at 01:15:50PM -0500, Rahul Joshi wrote:
> Hi,
>
>
>
> We are trying to use Sqoop to import data from Oracle. The table has
> CLOB as one of its column types, and that column contains newline
> characters in many places. We tried the --hive-drop-import-delims
> option, but somehow it is not working: the data still contains
> newlines, so the Hive table does not read it properly. We found that
> this works smoothly with SQL Server tables. The tables / commands /
> Sqoop options are more or less the same (except connection strings
> etc.), so we are not sure why it fails with Oracle. When importing from
> Oracle the delimiters are not dropped, whereas for SQL Server the data
> on HDFS is modified.
>
>
>
> Sqoop also has a way to treat a CLOB as an external file (setting
> --inline-lob-limit to 0), and we wanted to know how this can be used
> together with Hive. We could import the data using this option, but the
> import fails if it is combined with the --hive-import option. Is there
> any known way of dealing with such external CLOB data in Hive?
>
>
>
> Please let us know if anyone has any suggestions.
>
>
>
> Regards,
>
> --Rahul Joshi.
>
>