Sqoop >> mail # user >> Using Sqoop incremental import as chunk


RE: Using Sqoop incremental import as chunk
Thanks a lot, Felix & Jarcec. So it looks like, if I am running an Oozie coordinator job that periodically imports chunks of data through Sqoop, I need to change the boundary query values before each call to the Sqoop action. For example:
--boundary-query 'select 1,20' - for the 1st run
--boundary-query 'select 21,40' - for the 2nd run
Please correct me if I'm wrong. Thanks again.
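
For illustration, the per-run boundary values described above could be computed as in the sketch below. It is only a sketch under stated assumptions: the ids are contiguous and start at 1, the chunk size is 20 as in the original question, and the run index is 1-based. The helper names chunk_bounds and boundary_query_args are hypothetical, and how the run index gets from the Oozie coordinator to the Sqoop action is not shown here.

```python
# Hypothetical helpers: compute the inclusive id range for the n-th run
# and build the matching --boundary-query argument for Sqoop.

def chunk_bounds(run_index, chunk_size=20, first_id=1):
    """Return (low, high), inclusive, for a 1-based run index.

    Assumes contiguous ids starting at first_id.
    """
    if run_index < 1:
        raise ValueError("run_index is 1-based")
    low = first_id + (run_index - 1) * chunk_size
    high = low + chunk_size - 1
    return low, high


def boundary_query_args(run_index, chunk_size=20, first_id=1):
    """Return the two command-line tokens for this run's boundary query."""
    low, high = chunk_bounds(run_index, chunk_size, first_id)
    return ["--boundary-query", f"select {low},{high}"]


if __name__ == "__main__":
    # Five runs of 20 ids each cover ids 1..100.
    for run in range(1, 6):
        print(run, " ".join(boundary_query_args(run)))
```

The run index, rather than the last imported id, drives the range here, so a rerun of the same coordinator action would produce the same boundaries.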

> Date: Wed, 8 May 2013 11:08:05 -0700
> From: [EMAIL PROTECTED]
> To: [EMAIL PROTECTED]
> Subject: Re: Using Sqoop incremental import as chunk
>
> Hi Tanzir,
> Incremental import does not work in chunks; it always imports everything since the last import, i.e. everything from --last-value up. If needed, you can simulate chunks using the --boundary-query argument, as Felix advised.
>
> Jarcec
>
> On Wed, May 08, 2013 at 01:46:47PM -0400, Felix GV wrote:
> > --boundary-query
> >
> > http://sqoop.apache.org/docs/1.4.3/SqoopUserGuide.html#_connecting_to_a_database_server
> >
> > --
> > Felix
> >
> >
> > On Wed, May 8, 2013 at 1:00 PM, Tanzir Musabbir <[EMAIL PROTECTED]> wrote:
> >
> > >  Hello everyone,
> > >
> > > Is it really possible to import data chunk-wise through Sqoop incremental
> > > import?
> > >
> > > Say I have a table with ids 1, 2, 3, ..., N (here N is 100) and I want to
> > > import it in chunks, like:
> > > 1st import: 1,2,3.... 20
> > > 2nd import: 21,22,23.....40
> > > last import: 81,82,83....100
> > >
> > > I have read about the Sqoop job with incremental import and also know the
> > > --last-value parameter but do not know how to pass the chunk size. For the
> > > above example, chunk size here is 20.
> > >
> > >
> > > Any information will be highly appreciated. Thanks in advance.
> > >