Thank you for explaining your use case; I have a better idea of what you're trying to achieve now. I'm afraid that merge won't do the delete magic for you either. The Hive queries seem like a reasonable solution to me.
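For reference, the delete handling described further down in the thread (pulling only the ids from the database and outer-joining them against the merged table) can be sketched as a Hive query. This is a minimal sketch, not from the thread itself: `merged`, `merged_clean`, `current_ids`, and `id` are placeholder names, and `current_ids` is assumed to be a fresh Sqoop import of only the primary keys from the source database.

```sql
-- Keep only the rows whose id still exists in the source database;
-- rows deleted upstream have no match in current_ids and are dropped.
INSERT OVERWRITE TABLE merged_clean
SELECT m.*
FROM merged m
LEFT OUTER JOIN current_ids c ON m.id = c.id
WHERE c.id IS NOT NULL;
```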
On Tue, Oct 23, 2012 at 12:11:19PM -0400, Chalcy wrote:
> Hi Jarcec,
> I split the questions into 2, actually trying to achieve one objective.
> I split the question into two, but I'm actually trying to achieve one
> objective. The use case is not to export back to the database. For huge
> tables, we do a one-time pull, then incremental appends based on modified
> date into a new table, and merge both so I get the updated rows. I am
> using a left outer join effectively, but would like to try sqoop merge if
> it is as easy as just giving it the input and incremented tables to merge.
> Also, some rows may have been deleted in the database by the time we do
> the incremental update to the Hive table, and I need to delete those rows
> as well. The way I handle it is to pull only the ids (unique id) from the
> database and do another outer join, so the rows deleted in the database
> will not be in the merged Hive table.
> Thanks, Jarcec,
> On Tue, Oct 23, 2012 at 11:02 AM, Jarek Jarcec Cecho <[EMAIL PROTECTED]>wrote:
> > Hi Chalcy,
> > I'm afraid there isn't a way to achieve deletes from within Sqoop.
> > Just a quick question: it seems to me that you're trying to import data to
> > HDFS, do some transformations, and put the data back into your database
> > (using updates, inserts, and deletes). If I understand your use case
> > correctly, I would propose truncating the table after your import and using
> > a simple export to load the updated data. I believe such an approach will
> > be faster than selective inserts, updates, and deletes.
> > Jarcec
> > On Tue, Oct 23, 2012 at 09:44:04AM -0400, Chalcy wrote:
> > > Hello sqoop users,
> > >
> > > Sqoop incremental append for inserts and updates works really great. Is
> > > there any way to handle deletes? I am planning to do it with a left outer
> > > join, but am trying to find out if there is any other way.
> > >
> > > Thanks,
> > > Chalcy