Home | About | Sematext search-lucene.com search-hadoop.com
 Search Hadoop and all its subprojects:

Switch to Threaded View
Sqoop >> mail # dev >> Handling CLOBs in Sqoop - Hive Import


Copy link to this message
-
Re: Handling CLOBs in Sqoop - Hive Import
Hi Rahul,
sadly the parameter --hive-drop-import-delims is applicable only for String based types (such as CHAR, VARCHAR, NCHAR, ...), it's not applicable to CLOB. To workaround this, you can re-type the CLOB field into String using --map-column-java parameter, such as:

  sqoop import --map-column-java $columnName=String

Jarcec

On Wed, Nov 06, 2013 at 01:15:50PM -0500, Rahul Joshi wrote:
> Hi,
>
>
>
> We are trying to use Sqoop for importing data from Oracle. The table has
> CLOB as one of its column type which contains newline characters at many
> places. Tried using --hive-drop- import-delims option but somehow it’s not
> working. The data still contains newlines, and so Hive table doesn’t read
> them properly. Found that this works with SQL Server tables smoothly. The
> table / commands / sqoop options are more or less similar (except
> connection strings etc), not sure why it’s not working with Oracle. In case
> of import from Oracle,  the delims are not getting droped where for SQL
> Server, it modifies the data on HDFS.
>
>
>
> Sqoop also has a way to treat CLOB as external file (setting --inline-lob-limit
> to 0), wanted to know how this can be used along with Hive. Could import
> the data using this option, but the import fails if this option is used
> with –hive-import option. Is there any known way of dealing with such
> external CLOB data in Hive?
>
>
>
> Please let us know if anyone has any suggestions.
>
>
>
> Regards,
>
> --Rahul Joshi.
>
>
> On 6 November 2013 13:09, Rahul Joshi <[EMAIL PROTECTED]> wrote:
>
> > Hi,
> >
> >
> >
> > We are trying to use Sqoop for importing data from Oracle. The table has
> > CLOB as one of its column type which contains newline characters at many
> > places. Tried using --hive-drop- import-delims option but somehow it’s not
> > working. The data still contains newlines, and so Hive table doesn’t read
> > them properly. Found that this works with SQL Server tables smoothly. The
> > table / commands / sqoop options are more or less similar (except
> > connection strings etc), not sure why it’s not working with Oracle. In case
> > of import from Oracle,  the delims are not getting droped where for SQL
> > Server, it modifies the data on HDFS.
> >
> >
> >
> > Sqoop also has a way to treat CLOB as external file (setting --inline-lob-limit
> > to 0), wanted to know how this can be used along with Hive. Could import
> > the data using this option, but the import fails if this option is used
> > with –hive-import option. Is there any known way of dealing with such
> > external CLOB data in Hive?
> >
> >
> >
> > Please let us know if anyone has any suggestions.
> >
> >
> >
> > Regards,
> >
> > --Rahul Joshi.
> >
>
>
>
> --
> Regards,
> Rahul R. Joshi.