Sqoop, mail # user - sqoop merge question


Re: sqoop merge question
Chalcy 2012-10-23, 16:00
Hi Jarcec,

If we are merging two HDFS datasets, I do not understand why we would need
a database connection. Could you explain?

Thanks,
Chalcy
On Tue, Oct 23, 2012 at 10:59 AM, Jarek Jarcec Cecho <[EMAIL PROTECTED]> wrote:

> Hi Chalcy,
> Sqoop needs to be able to parse the files you're trying to merge, because
> older entries must be replaced by newer ones. Usually Sqoop generates a
> special class for this purpose based on the connection in use. In the merge
> case, however, there is no connection to the database, so you need to
> specify this class manually. The class is generated automatically when you
> run the import tool, and it can also be generated manually with the codegen
> tool [1]. You can find more information about those two arguments of the
> merge tool in our user guide [2].
>
> Jarcec
>
> Links:
> 1:
> http://sqoop.apache.org/docs/1.4.2/SqoopUserGuide.html#_literal_sqoop_codegen_literal
> 2:
> http://sqoop.apache.org/docs/1.4.2/SqoopUserGuide.html#_literal_sqoop_merge_literal
>
> On Tue, Oct 23, 2012 at 09:41:07AM -0400, Chalcy wrote:
> > Hello Sqoop users,
> >
> > I tried to use sqoop merge and understand all the parameters except
> > --class-name and --jar-file.  What should that be?  Sqoop errors out if I
> > do not specify them.
> >
> > The command I am using is
> > sqoop merge --new-data user/hadoop/testincrement --onto
> > /user/hadoop/exisitngdata --target-dir /user/hadoop/mergeddir --merge-key
> > rowid
> >
> > Thanks,
> > Chalcy
>
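
For readers landing on this thread, the two steps Jarcec describes (generate the record class with codegen, then pass it to merge) might be sketched as below. This is only a sketch under stated assumptions: the JDBC URL, user, table name, class name, and output directory are hypothetical placeholders that do not come from the thread, and the jar path should be taken from what codegen actually reports rather than from the guess used here. The merge paths and key are copied from Chalcy's command as posted. By default the script only prints the commands; set RUN=1 to execute them.

```shell
#!/bin/sh
# Hypothetical placeholders: none of these values appear in the thread.
JDBC_URL="jdbc:mysql://db.example.com/mydb"
DB_USER="someuser"
TABLE="mytable"
CLASS_NAME="MyRecord"
BIN_DIR="/tmp/sqoop-codegen"
# Assumption: check the codegen log output for the real jar location.
JAR_FILE="$BIN_DIR/$CLASS_NAME.jar"

# Print the command unless RUN=1 is set, in which case execute it.
run() {
  if [ "${RUN:-0}" = "1" ]; then
    "$@"
  else
    echo "$*"
  fi
}

# Step 1: generate the record class and jar. This step talks to the database.
codegen_cmd() {
  run sqoop codegen --connect "$JDBC_URL" --username "$DB_USER" -P \
    --table "$TABLE" --class-name "$CLASS_NAME" --bindir "$BIN_DIR"
}

# Step 2: merge the two HDFS datasets using the generated class.
# Paths and merge key are taken verbatim from the original command.
merge_cmd() {
  run sqoop merge --new-data user/hadoop/testincrement \
    --onto /user/hadoop/exisitngdata --target-dir /user/hadoop/mergeddir \
    --merge-key rowid \
    --jar-file "$JAR_FILE" --class-name "$CLASS_NAME"
}

codegen_cmd
merge_cmd
```

If the data was originally brought in with the import tool, the class and jar generated during that import can be passed to --class-name and --jar-file instead, and step 1 can be skipped.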