Sqoop user mailing list: Another sqoop incremental update question


Chalcy 2012-10-23, 13:44
Jarek Jarcec Cecho 2012-10-23, 15:02
Chalcy 2012-10-23, 16:11
Re: Another sqoop incremental update question
Hi Chalcy,
thank you for explaining your use case; I have a better idea now of what you're trying to achieve. I'm afraid that merge won't do the delete magic for you either. The Hive queries seem like a reasonable solution to me.

Jarcec
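
[Editor's note: a minimal sketch of the Hive-side merge and delete handling discussed in this thread, not the posters' actual queries. The table and column names (base_table, delta_table, current_ids, id) and the staging table merged_tmp are hypothetical placeholders for whatever the real schema uses.]

    hive -e "
    -- Merge: keep the base rows that were not re-imported, plus every row from
    -- the incrementally appended delta (which carries the inserts and updates).
    -- merged_tmp is assumed to be an existing staging table with the same schema.
    INSERT OVERWRITE TABLE merged_tmp
    SELECT u.* FROM (
      SELECT b.* FROM base_table b
      LEFT OUTER JOIN delta_table d ON b.id = d.id
      WHERE d.id IS NULL
      UNION ALL
      SELECT d2.* FROM delta_table d2
    ) u;

    -- Delete handling: current_ids holds only the ids pulled from the source
    -- database; merged rows whose id no longer exists there are dropped.
    INSERT OVERWRITE TABLE base_table
    SELECT m.* FROM merged_tmp m
    LEFT OUTER JOIN current_ids c ON m.id = c.id
    WHERE c.id IS NOT NULL;
    "

The final LEFT OUTER JOIN with the IS NOT NULL filter mirrors the "another outer join" described below; a plain inner join against current_ids would achieve the same result.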

On Tue, Oct 23, 2012 at 12:11:19PM -0400, Chalcy wrote:
> Hi Jarcec,
>
> I split the questions into two, but I am actually trying to achieve one objective.
>
> The use case is not to export back to the database.  For huge tables, I do a
> one-time pull, then an incremental append based on the modified date into a new
> table, and then merge the two so I get the updated rows. I am doing this
> efficiently with a left outer join, but I would like to try sqoop merge, if it
> is as easy as just giving it the original input and the incremental table and
> letting it merge.
>
> Also, some rows may have been deleted in the database by the time we do the
> incremental update to the Hive table, and I should be able to delete those
> rows as well.  The way I handle this is to pull only the ids (the unique id)
> from the database and do another outer join, so the rows deleted in the
> database will not end up in the merged Hive table.
>
> Thanks, Jarcec,
> Chalcy
>
>
> On Tue, Oct 23, 2012 at 11:02 AM, Jarek Jarcec Cecho <[EMAIL PROTECTED]> wrote:
>
> > Hi Chalcy,
> > I'm afraid that there isn't a way to achieve deletes from within
> > Sqoop.
> >
> > Just a quick question. It seems to me that you're trying to import data to
> > HDFS, do some transformations, and put the data back into your database
> > (using updates, inserts and deletes). If I understand your use case
> > correctly, I would propose truncating the table after your import and using
> > a simple export to load the updated data. I believe that such an approach
> > will be faster than selective inserts, updates and deletes.
> >
> > Jarcec
> >
> > On Tue, Oct 23, 2012 at 09:44:04AM -0400, Chalcy wrote:
> > > Hello sqoop users,
> > >
> > > Sqoop incremental append for insert and update works really great.  Is
> > > there any way to handle deletes?  I am planning to do it with a left
> > > outer join, but I am trying to find out if there is any other way.
> > >
> > > Thanks,
> > > Chalcy
> >
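
[Editor's note: for reference, a hedged sketch of what the Sqoop side of the workflow discussed above could look like. The connection string, credentials, table, directories, timestamp, and the generated class/jar names are hypothetical placeholders, not values taken from the thread.]

    # 1. Incremental pull of rows modified since the last run, appended next to
    #    the original full import.
    sqoop import \
      --connect jdbc:mysql://db.example.com/shop \
      --username sqoop_user -P \
      --table orders \
      --incremental lastmodified \
      --check-column modified_date \
      --last-value "2012-10-22 00:00:00" \
      --target-dir /data/orders_delta \
      --append

    # 2. Generate the record class that sqoop merge needs to parse the files
    #    (the jar and class names below assume codegen's defaults for this table).
    sqoop codegen \
      --connect jdbc:mysql://db.example.com/shop \
      --username sqoop_user -P \
      --table orders \
      --bindir /tmp/sqoop-orders \
      --outdir /tmp/sqoop-orders

    # 3. Reconcile the delta with the original import: records from --new-data
    #    replace records with the same --merge-key in --onto. Note that sqoop
    #    merge only applies inserts and updates; rows deleted in the source
    #    database remain in HDFS, which is why the Hive-side join against the
    #    current ids (discussed above) is still needed.
    sqoop merge \
      --new-data /data/orders_delta \
      --onto /data/orders_base \
      --target-dir /data/orders_merged \
      --jar-file /tmp/sqoop-orders/orders.jar \
      --class-name orders \
      --merge-key id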