Home | About | Sematext search-lucene.com search-hadoop.com
 Search Hadoop and all its subprojects:

Switch to Plain View
Sqoop, mail # user - Sqoop import - column removed in the source DB and all the records are off by 1 column.


+
Anandha L Ranganathan 2012-12-01, 02:15
Copy link to this message
-
Re: Sqoop import - column removed in the source DB and all the records are off by 1 column.
Jarek Jarcec Cecho 2012-12-01, 18:03
Hi Anandha,
I do have couple of suggestions that could help in your scenario:

1) Rename your hive table to "TABLENAME_OLD" or something and let Sqoop create new hive table with proper schema. This way you will not loose any data nor appropriate schema, however you will end up with two tables.

2) Change your table based import to query based import and select one column extra:

--query "select col1, col2, NULL as col3, col4, ..."

Jarcec

On Fri, Nov 30, 2012 at 06:15:02PM -0800, Anandha L Ranganathan wrote:
> In my Sqoop import one of the column in the source table got deleted and
> that resulting in data issue. This resulting   data are off by 1 column.
> The removed column was in the middle of the schema. If it were last column
> then wouldn't have any worries.
>
> Data is imported  from MySql to Hive using Sqoop.  I am using  sqoop-1.3.0
>
> Here is the syntax.
>
> sqoop import     --hive-import
>                 --options-file 'credential.txt'
>                 --table 'TABLENAME '
>                 --where 'created between 1353960000000 and 1353963600000'
>                 --hive-partition-key part
>                 --hive-partition-value 'PARTITION_VALUE'
>                 --hive-overwrite
>                 --hive-delims-replacement
>
>
>
> Now the problem is One of the column in the source DB got removed.
>
> I tried with workaround by including  the --columns
>
>  1) By hardcoding third column with quotes.
>         --columns "col1,col2,'col3' as col3,col4"
>
> but this gives error Column name 'col3' not in table
>
>
> 2) Then i tried with (col2 repeated twice)
>    --columns " col1,col2, col2 , col4"
>
> It threw an error
>    Imported Failed: Duplicate Column identifier specified:
>
>
> 3) Then i tried with (col2 as col3)
>    --columns " col1,col2, col2 as col3, col4"
>
> ERROR tool.ImportTool: Imported Failed: Column name 'authid uuid' not in
> table
>
> Could anybody suggest workaround for this.
>
> Thanks
+
Anandha L Ranganathan 2012-12-02, 04:13
+
Jarek Jarcec Cecho 2012-12-02, 17:11
+
Anandha L Ranganathan 2012-12-02, 19:10
+
Jarek Jarcec Cecho 2012-12-02, 19:35