Sqoop >> mail # user >> sqoop merge question


Re: sqoop merge question
Hi Jarcec,

If we are merging two HDFS datasets, I do not understand why we would need
a database connection. Could you explain?

Thanks,
Chalcy
On Tue, Oct 23, 2012 at 10:59 AM, Jarek Jarcec Cecho <[EMAIL PROTECTED]> wrote:

> Hi Chalcy,
> Sqoop needs to be able to parse the files you're trying to merge, because
> older records must be updated with newer entries. Usually Sqoop generates a
> special class for this purpose based on the connection in use; however, in
> the merge case there is no connection to the database, so you need to
> specify that class manually. The class is generated automatically when you
> run the import tool, and it can also be generated manually with the codegen
> tool [1]. You can find more information about those two arguments to the
> merge tool in our user guide [2].
>
> Jarcec
>
> Links:
> [1] http://sqoop.apache.org/docs/1.4.2/SqoopUserGuide.html#_literal_sqoop_codegen_literal
> [2] http://sqoop.apache.org/docs/1.4.2/SqoopUserGuide.html#_literal_sqoop_merge_literal
>
> On Tue, Oct 23, 2012 at 09:41:07AM -0400, Chalcy wrote:
> > Hello Sqoop users,
> >
> > I tried to use sqoop merge and understand all the parameters except
> > --class-name and --jar-file.  What should that be?  Sqoop errors out if I
> > do not specify them.
> >
> > The command I am using is
> > sqoop merge --new-data user/hadoop/testincrement --onto
> > /user/hadoop/exisitngdata --target-dir /user/hadoop/mergeddir --merge-key
> > rowid
> >
> > Thanks,
> > Chalcy
>
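
For reference, the two-step flow Jarcec describes (generate the record class
with codegen, then pass it to merge via --jar-file and --class-name) might look
roughly like the sketch below. The JDBC URL, table name, and output directory
are hypothetical placeholders. The jar path assumes codegen writes <table>.jar
into the --bindir directory and names the class after the table, so check the
codegen output for the actual location and class name. The HDFS paths and
merge key are copied from the original command. The script only prints the
commands unless DRY_RUN=0 is set.

```shell
#!/usr/bin/env bash
# Sketch only: the values below are assumptions, not taken from the thread.
JDBC_URL="jdbc:mysql://db.example.com/mydb"   # hypothetical database URL
TABLE="mytable"                               # hypothetical source table
BIN_DIR="/tmp/sqoop-codegen"                  # hypothetical output directory

# Print the command in dry-run mode (the default); execute it when DRY_RUN=0.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Step 1: generate the record class and jar. This is the only step that
# needs a database connection, because the class is derived from the table.
run sqoop codegen --connect "$JDBC_URL" --table "$TABLE" \
  --bindir "$BIN_DIR" --outdir "$BIN_DIR"

# Step 2: merge the two HDFS datasets, pointing Sqoop at the generated class
# so it can parse the records. No database connection is involved here.
run sqoop merge --new-data user/hadoop/testincrement \
  --onto /user/hadoop/exisitngdata \
  --target-dir /user/hadoop/mergeddir \
  --jar-file "$BIN_DIR/$TABLE.jar" --class-name "$TABLE" \
  --merge-key rowid
```

If the data was originally brought in with sqoop import, the jar and class
generated by that import can be reused instead of running codegen again.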