Re: Using Sqoop incremental import as chunk
That's the only way I see you being able to achieve this, yes.

(Assuming you want many separate sequential imports, because if importing
the chunks in parallel is fine with you then you could use a single sqoop
command and let the size of your chunks be a by-product of the number of
mappers you choose.)
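
As a rough illustration of the parallel option, the sketch below builds a single import command and estimates how many rows each mapper would handle. The connection string, table name and target directory are hypothetical placeholders, and the per-mapper figure is only an approximation: Sqoop's own splitter decides the actual ranges, so the real split sizes may differ.

```python
import math


def approx_rows_per_mapper(min_id, max_id, num_mappers):
    """Rough estimate of rows per mapper for a dense integer id range.

    This is an approximation only; Sqoop computes the real split ranges.
    """
    if num_mappers < 1:
        raise ValueError("num_mappers must be >= 1")
    total_rows = max_id - min_id + 1
    return math.ceil(total_rows / num_mappers)


def single_import_args(connect, table, split_by, target_dir, num_mappers):
    """Build the argument list for one parallel sqoop import."""
    return [
        "sqoop", "import",
        "--connect", connect,          # hypothetical JDBC URL
        "--table", table,              # hypothetical table name
        "--split-by", split_by,        # column used to divide the work
        "--target-dir", target_dir,    # hypothetical HDFS directory
        "--num-mappers", str(num_mappers),
    ]


if __name__ == "__main__":
    args = single_import_args(
        "jdbc:mysql://db.example.com/shop", "orders", "id",
        "/data/orders", 5,
    )
    print(" ".join(args))
    print("approx rows per mapper:", approx_rows_per_mapper(1, 100, 5))
```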

--
Felix
On Wed, May 8, 2013 at 2:17 PM, Tanzir Musabbir <[EMAIL PROTECTED]> wrote:

> Thanks a lot Felix & Jarcec. So it looks like, if I am running an Oozie
> coordinator job which periodically imports data in chunks through Sqoop, I
> need to change the boundary query value before calling the Sqoop action
> every time. For example:
>
> --boundary-query 'select 1,20' - for the 1st run
> --boundary-query 'select 21,40' - for the 2nd run
>
> Please correct me if I'm wrong. Thanks again.
>
>
> > Date: Wed, 8 May 2013 11:08:05 -0700
> > From: [EMAIL PROTECTED]
> > To: [EMAIL PROTECTED]
> > Subject: Re: Using Sqoop incremental import as chunk
>
> >
> > Hi Tanzir,
> > incremental import does not work in chunks; it always imports
> > everything since the last import, i.e. everything from --last-value up.
> > You can simulate chunks if needed using the --boundary-query argument,
> > as Felix advised.
> >
> > Jarcec
> >
> > On Wed, May 08, 2013 at 01:46:47PM -0400, Felix GV wrote:
> > > --boundary-query
> > >
> > >
> > > http://sqoop.apache.org/docs/1.4.3/SqoopUserGuide.html#_connecting_to_a_database_server
> > >
> > > --
> > > Felix
> > >
> > >
> > > On Wed, May 8, 2013 at 1:00 PM, Tanzir Musabbir <[EMAIL PROTECTED]> wrote:
> > >
> > > > Hello everyone,
> > > >
> > > > Is it really possible to import data chunk-wise through Sqoop
> > > > incremental import?
> > > >
> > > > Say I have a table with ids 1, 2, 3, ..., N (here N is 100) and I
> > > > want to import it in chunks, like:
> > > > 1st import: 1,2,3.... 20
> > > > 2nd import: 21,22,23.....40
> > > > last import: 81,82,83....100
> > > >
> > > > I have read about Sqoop jobs with incremental import and also know
> > > > the --last-value parameter, but I do not know how to pass the chunk
> > > > size. For the above example, the chunk size is 20.
> > > >
> > > >
> > > > Any information will be highly appreciated. Thanks in advance.
> > > >
>
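
The per-run boundary values discussed in this thread ('select 1,20' for the 1st run, 'select 21,40' for the 2nd, and so on) can be derived from a run number and a chunk size. The following is a minimal sketch of that arithmetic, assuming dense integer ids starting at 1 and a caller (for example a scheduler) that supplies a 1-based run number. The helper names, connection string, table name and per-run target directory layout are hypothetical and not part of Sqoop.

```python
def chunk_bounds(run_number, chunk_size, first_id=1, last_id=None):
    """Return the inclusive (low, high) id range for a 1-based run number."""
    if run_number < 1 or chunk_size < 1:
        raise ValueError("run_number and chunk_size must be >= 1")
    low = first_id + (run_number - 1) * chunk_size
    high = low + chunk_size - 1
    if last_id is not None:
        if low > last_id:
            raise ValueError("run_number is past the end of the id range")
        high = min(high, last_id)  # clamp the final chunk
    return low, high


def boundary_query(run_number, chunk_size, first_id=1, last_id=None):
    """Format the bounds the way the thread does: 'select <low>,<high>'."""
    low, high = chunk_bounds(run_number, chunk_size, first_id, last_id)
    return "select {},{}".format(low, high)


def chunk_import_args(run_number, chunk_size, connect, table, split_by,
                      base_dir):
    """Build the sqoop argument list for one sequential chunk import."""
    return [
        "sqoop", "import",
        "--connect", connect,      # hypothetical JDBC URL
        "--table", table,          # hypothetical table name
        "--split-by", split_by,
        "--boundary-query", boundary_query(run_number, chunk_size),
        # Hypothetical layout: one output directory per run.
        "--target-dir", "{}/chunk_{:04d}".format(base_dir, run_number),
    ]


if __name__ == "__main__":
    for run in (1, 2, 5):
        print(" ".join(chunk_import_args(
            run, 20, "jdbc:mysql://db.example.com/shop", "orders", "id",
            "/data/orders")))
```

Writing each run to its own directory is one way to keep sequential imports from colliding on an existing target directory; other layouts are possible.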