Sqoop, mail # user - Sqoop not picking up immediate changes


Re: Sqoop not picking up immediate changes
burberry blues 2013-11-15, 05:37
I got the solution. I explicitly committed the changes in the Oracle
database, and Sqoop then identified the updates and wrote them to the
filesystem location. I then used the Sqoop merge command, which replaced
the old file with the new file and removed the duplicates based on the
primary key.
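
For readers following along, the merge step described above might look roughly like the sketch below. The HDFS paths, the jar file name and the class name are hypothetical (the thread does not give them), and col1 is assumed to be the primary key. The script only prints the command so it can be reviewed first.

```shell
#!/bin/sh
# Hypothetical locations; substitute your own HDFS directories.
OLD_DIR="/user/blues/table1_base"       # output of the first import
NEW_DIR="/user/blues/table1_delta"      # output of the incremental import
MERGED_DIR="/user/blues/table1_merged"  # where the merged result is written

# Jar and class generated by Sqoop during the import (assumed names).
JAR_FILE="table1.jar"
CLASS_NAME="table1"

# Build the merge command. Records in NEW_DIR take precedence over records
# in OLD_DIR that share the same value of the merge key (col1 is assumed).
merge_cmd() {
  echo sqoop merge \
    --new-data "$NEW_DIR" \
    --onto "$OLD_DIR" \
    --target-dir "$MERGED_DIR" \
    --jar-file "$JAR_FILE" \
    --class-name "$CLASS_NAME" \
    --merge-key col1
}

# Print the command for review instead of running it.
merge_cmd
```

Once the printed command looks right, it can be run with eval "$(merge_cmd)" on a machine where Sqoop is installed.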

But one question: the --split-by option in Sqoop accepts only one key
column. What if the primary key is a combination of columns? I get an
error when I give multiple fields in the --split-by parameter.
Please clarify.
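
A possible way around this, offered as a sketch and not as a confirmed answer from the thread: the split column is only used to divide rows among mappers, so it does not have to be unique. One column of the composite key can be passed alone, or the import can run with a single mapper and no split column. The host, port, service name, user and target directories below are hypothetical; the script only prints the commands.

```shell
#!/bin/sh
# Hypothetical connection details; replace with your own.
CONNECT="jdbc:oracle:thin:@//dbhost:1521/ORCL"
DB_USER="scott"

# Variant A: split on one column of the composite key (col1 is assumed).
# Uneven values in that column can lead to unbalanced map tasks.
import_split_on_one_column() {
  echo sqoop import --connect "$CONNECT" --username "$DB_USER" -P \
    --table table1 --split-by col1 --target-dir "$1"
}

# Variant B: use a single mapper, so no split column is needed at all.
import_single_mapper() {
  echo sqoop import --connect "$CONNECT" --username "$DB_USER" -P \
    --table table1 --num-mappers 1 --target-dir "$1"
}

# Print both commands for review; $1 is the target directory.
import_split_on_one_column /user/blues/out_a
import_single_mapper /user/blues/out_b
```

Variant A keeps the import parallel, while Variant B trades parallelism for simplicity and is likely only reasonable for small tables.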
On Thu, Nov 14, 2013 at 1:13 AM, Anas Mosaad <[EMAIL PROTECTED]> wrote:

> Hi all,
>
> I'm not experienced with Sqoop but I'm trying to help. Is it possible to
> see the SQL statements executed by Sqoop? I believe that if the statements
> are logged anywhere, Blues will be able to pinpoint the issue.
>
>
> Best Regards
> Anas Mosaad
>
>
>
> From:        burberry blues <[EMAIL PROTECTED]>
> To:        [EMAIL PROTECTED],
> Date:        11/13/2013 07:02 PM
> Subject:        Re: Sqoop not picking up immediate changes
> ------------------------------
>
>
>
> Hi Jarek,
> Initially in the DB I have
>
> col1 col2 col3
> 1        a    08-NOV-2013
> 2        b    08-NOV-2013
> 3        c    08-NOV-2013
>
> First time sqoop import command
> ===============================
> sqoop import --connect jdbc:oracle:thin:@//url:driver/database --username <username>
> --password <password> --table table1 --columns col1,col2,col3
> --incremental lastmodified --check-column col3
> --last-value "2013-11-07 00.00.00.0" --split-by col1 --target-dir <outputdir>
>
> When I ran the above sqoop import, I successfully got all 3 records.
>
>
> Then I made 2 updates in the DB
>
> col1 col2 col3
> 1        d    10-NOV-2013
> 2        e    10-NOV-2013
> 3        c    08-NOV-2013
>
> Second time Sqoop Command
> =========================
> I read that Sqoop is currently unable to merge updated records, so
> I am importing the updates into a new directory and then using "sqoop
> merge" to merge this new output with the previous import output.
>
> So the command I ran is
>
> sqoop import --connect jdbc:oracle:thin:@//url:driver/database --username <username>
> --password <password> --table table1 --columns col1,col2,col3
> --incremental lastmodified --check-column col3
> --last-value "2013-11-09 00.00.00.0" --split-by col1 --target-dir <outputdir1>
>
> This time, according to the updates, I should get the records with col1
> values 1 and 2, as they were updated.
> But the second sqoop import produced zero records in the output. (Even
> during the job execution it reports map input records or reduce output
> records as 0.)
>
> The changes are present in the DB (I checked them by running a
> select * query in the DB), so why can't Sqoop find them? It seems Sqoop
> didn't find any updates from 9th Nov onward. Please assist me with this issue.
>
> Thanks,
> Blues.
>
>
>
>
> On Wed, Nov 13, 2013 at 8:32 AM, Jarek Jarcec Cecho <[EMAIL PROTECTED]>
> wrote:
> Hi Blues,
> would you mind sharing details about your use case? Table schemas, exact
> commands (both on database and in command line) and associated logs?
>
> Wild guess - when you are changing the rows in the database, are you
> committing the ongoing transaction? Sqoop will create a new connection with
> new transaction, so due to ACID it won't pick up any uncommitted changes.
>
> Jarcec
>
> On Tue, Nov 12, 2013 at 10:36:10PM -0800, burberry blues wrote:
> > Hi Team,
> >
> > I am having a problem with the following scenario.
> >
> > In the DB I update column1 of a row, and column2 gets set to the
> > current timestamp.
> > But when I try to import those changes through Sqoop using --incremental
> > lastmodified --check-column column2 --last-value <less than current
> > date>, it shows 0 records imported for the changed rows.
> >
> > There are changes in the DB, but Sqoop works as if it couldn't find the
> > updated ones and still points to the old records.
> >
> > i.e. before updating I have 3 records with date as 10th Nov, I asked sqoop