Re: problems importing from oracle
Kathleen Ting 2012-03-12, 20:08
Bruce, FYI, this issue is now documented in the Sqoop Troubleshooting
Guide: https://issues.apache.org/jira/browse/SQOOP-461

Regards, Kathleen

On Wed, Mar 7, 2012 at 4:28 PM, Arvind Prabhakar <[EMAIL PROTECTED]> wrote:
> Hi Bruce,
>
> On Mon, Mar 5, 2012 at 6:50 PM, Bruce Bian <[EMAIL PROTECTED]> wrote:
> > any feedback on this? Shall I create a jira issue on this?
> >
> >
> > On Thu, Mar 1, 2012 at 11:12 PM, Bruce Bian <[EMAIL PROTECTED]> wrote:
> >>
> >> Hi Jarcec,
> >> After looking at the code of sqoop, my previous split-by problem is caused
> >> by stating '--driver "oracle.jdbc.OracleDriver" ' in the sqoop command, and
> >> the following code in org.apache.sqoop.manager.DefaultManagerFactory
> >> initialized a GenericJdbcManager instead of OracleManager even after
> >> --connection-manager OracleManager is specified
> >>
> >> SqoopOptions options = data.getSqoopOptions();
> >> String manualDriver = options.getDriverClassName();
> >> if (manualDriver != null) {
> >>   // User has manually specified JDBC implementation with --driver.
> >>   // Just use GenericJdbcManager.
> >>   return new GenericJdbcManager(manualDriver, options);
> >> }
> >> Any reason why the code above appears before initializing the connection
> >> manager specified by the user? Shouldn't it be put even after the connection
> >> scheme is judged?
>
> This is by design of the current implementation. When you specify an
> explicit driver, the builtin connection manager selection defaults to
> the generic connection manager. This allows you to use Sqoop with
> databases that have compliant JDBC drivers but are not directly
> supported by Sqoop.
>
> Is there any particular reason why you must specify the --driver
> option? By default the built in Oracle connection manager will choose
> the very driver you are trying to pass in.
>
> Thanks,
> Arvind
>
> >>
> >>
> >> On Thu, Mar 1, 2012 at 5:17 PM, Jarek Jarcec Cecho <[EMAIL PROTECTED]>
> >> wrote:
> >>>
> >>> Ignored :-)
> >>>
> >>> I do not believe that you're hitting exactly SQOOP-204. It seems that
> >>> SQOOP-204 has failed during creating Input Splits for your job. But you seem
> >>> to be dying after your job is being executed on hadoop cluster.
> >>>
> >>> I'm afraid that I do not know how to help you at the moment. Would you
> >>> mind upgrading on current 1.4.1 version?
> >>>
> >>> Jarcec
> >>>
> >>> On Thu, Mar 01, 2012 at 04:59:48PM +0800, Bruce Bian wrote:
> >>> > Hi Jarek,
> >>> > Please ignore my first problem for getting no hdfs results as it turns out
> >>> > to be my silly mistake during copying of the query. Sorry for the
> >>> > annoyance.
> >>> > The second problem of adding --split-by turns out to be SQOOP-204, but it
> >>> > should already be fixed in 1.3.0 while I'm using 1.3.0-cdh3u3 or is it?
> >>> >
> >>> > On Thu, Mar 1, 2012 at 4:10 PM, Bruce Bian <[EMAIL PROTECTED]> wrote:
> >>> >
> >>> > > also when I'm adding the --split-by a.prod_inst_id to the sqoop
> >>> > > command as in:
> >>> > > QUERY="SELECT a.*,
> >>> > > b.acnt_no,b.addr_id,b.postcode,b.acnt_rmnd_tp,b.print_tp,b.media_type,
> >>> > > c.cust_code,c.root_cust_code,
> >>> > > d.mdf_name,d.sub_bureau_code,d.bureau_cd,d.adm_sub_bureau_name,d.bureau_name
> >>> > > FROM prc_idap_pi_root a
> >>> > > LEFT OUTER JOIN prc_idap_pi_root_acnt b ON a.acnt_id=b.acnt_id
> >>> > > LEFT OUTER JOIN prc_idap_pi_root_cust c ON a.cust_id=c.cust_id
> >>> > > LEFT OUTER JOIN ocrm_vt_area d ON a.dev_area_id=d.area_id
> >>> > > WHERE lst_upd_tmp >= (SELECT date_val - 1/240 FROM
> >>> > > etl.etl_para_cfg_detail
> >>> > > WHERE para_id=84) AND \$CONDITIONS"
> >>> > > sqoop import \
> >>> > > --verbose \
> >>> > > --driver oracle.jdbc.OracleDriver \
> >>> > > --connect jdbc:oracle:thin:@10.239.47.36:1521/dx \
> >>> > > --username *** \
> >>> > > --password ****** \
> >>> > > --query "$QUERY" \
> >>> > > --split-by a.prod_inst_id \