Nitin kak 2013-02-04, 19:51
I'm afraid that the Microsoft SQL Server connector does not support direct mode, so the --direct parameter should not make any difference. In my personal experience, Microsoft's connector performs comparably to the built-in one, so I would not expect switching connectors to change performance.
It seems that in your case three threads (mappers) are writing data simultaneously into the database through the JDBC interface. I would say that 1 hour for a 2 GB data set is quite long. I would suggest taking a look at the SQL Server performance metrics during the export (I/O, CPU, ...) to see where the bottleneck is.
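To put a number on "quite long", here is a rough back-of-the-envelope calculation of the average write rate implied by the figures in this thread. It is only a sketch: it assumes that "2 GB" means 2 GiB, that the three mappers shared the work evenly, and that the whole hour was spent writing. The function name is made up for illustration.

```python
def export_throughput_mib_per_s(size_gib: float, duration_s: float, mappers: int = 1) -> float:
    """Average write rate in MiB/s, divided evenly across the given number of mappers."""
    return size_gib * 1024 / duration_s / mappers


# Figures from the thread: 2 GB exported in about 1 hour by three mappers.
# Assumption: 2 GB is treated as 2 GiB and the load is split evenly.
total = export_throughput_mib_per_s(2, 3600)
per_mapper = export_throughput_mib_per_s(2, 3600, mappers=3)

print(f"total: {total:.2f} MiB/s, per mapper: {per_mapper:.2f} MiB/s")
# prints "total: 0.57 MiB/s, per mapper: 0.19 MiB/s"
```

An aggregate rate well under 1 MiB/s is the reason the server-side metrics are worth checking before changing anything on the Sqoop side.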
On Mon, Feb 04, 2013 at 02:51:17PM -0500, Nitin kak wrote:
> Hi All,
> I am working on a Sqoop export to SQL Server. It took about 1 hr to
> export 2 GB of data. Three mappers were spawned. I am using the default SQL
> Server connector provided by Cloudera (couldn't use the MS SQL Server connector
> because of a bug in export). I was using the --direct option. It doesn't seem to
> have much effect. Any clues why it's so slow? Would the performance improve
> once I start using the MS SQL Server connector?