Home | About | Sematext search-lucene.com search-hadoop.com
 Search Hadoop and all its subprojects:

Switch to Threaded View
Sqoop >> mail # user >> Sqoop import - column removed in the source DB and all the records are off by 1 column.


Copy link to this message
-
Re: Sqoop import - column removed in the source DB and all the records are off by 1 column.
Hi Jarcec,

I went with second option.

giving "select col1, col2, NULL as col3, col4" was throwing
NullPointereException.

So I went with

"Select col1,col2,'col3' as col3, col4"

Thanks
Anand

The second option --query looks good but is it support partition ?
On Sat, Dec 1, 2012 at 10:03 AM, Jarek Jarcec Cecho <[EMAIL PROTECTED]>wrote:

> Hi Anandha,
> I do have couple of suggestions that could help in your scenario:
>
> 1) Rename your hive table to "TABLENAME_OLD" or something and let Sqoop
> create new hive table with proper schema. This way you will not loose any
> data nor appropriate schema, however you will end up with two tables.
>
> 2) Change your table based import to query based import and select one
> column extra:
>
> --query "select col1, col2, NULL as col3, col4, ..."
>
> Jarcec
>
> On Fri, Nov 30, 2012 at 06:15:02PM -0800, Anandha L Ranganathan wrote:
> > In my Sqoop import one of the column in the source table got deleted and
> > that resulting in data issue. This resulting   data are off by 1 column.
> > The removed column was in the middle of the schema. If it were last
> column
> > then wouldn't have any worries.
> >
> > Data is imported  from MySql to Hive using Sqoop.  I am using
>  sqoop-1.3.0
> >
> > Here is the syntax.
> >
> > sqoop import     --hive-import
> >                 --options-file 'credential.txt'
> >                 --table 'TABLENAME '
> >                 --where 'created between 1353960000000 and 1353963600000'
> >                 --hive-partition-key part
> >                 --hive-partition-value 'PARTITION_VALUE'
> >                 --hive-overwrite
> >                 --hive-delims-replacement
> >
> >
> >
> > Now the problem is One of the column in the source DB got removed.
> >
> > I tried with workaround by including  the --columns
> >
> >  1) By hardcoding third column with quotes.
> >         --columns "col1,col2,'col3' as col3,col4"
> >
> > but this gives error Column name 'col3' not in table
> >
> >
> > 2) Then i tried with (col2 repeated twice)
> >    --columns " col1,col2, col2 , col4"
> >
> > It threw an error
> >    Imported Failed: Duplicate Column identifier specified:
> >
> >
> > 3) Then i tried with (col2 as col3)
> >    --columns " col1,col2, col2 as col3, col4"
> >
> > ERROR tool.ImportTool: Imported Failed: Column name 'authid uuid' not in
> > table
> >
> > Could anybody suggest workaround for this.
> >
> > Thanks
>