Home | About | Sematext search-lucene.com search-hadoop.com
 Search Hadoop and all its subprojects:

Switch to Threaded View
Sqoop >> mail # user >> Fwd: sqoop and --direct option


Copy link to this message
-
Re: Fwd: sqoop and --direct option
Hi Thuy,
in case of direct import all your data are passed directly to mysqlimport utility that is then doing all the parsing and inserting into your database. Without direct mode Sqoop itself will parse data so that they can be inserted. Thus slight differences in edge error cases are expected as there are different tools involved to process the data.

Sqoop by default assumes that you're using string constant "null" to encode NULL values and everything else is assumed to be proper content of the column. However Hive is using different string constant "\N" to encode the NULL value. You need to tell Sqoop what is the correct substitution constant using --input-(non-)string parameters, like:

  sqoop export ... --input-null-string '\\N' --input-null-non-string '\\N'

Please note that the double backslash is needed due to way how this parameter is used withing Sqoop. Please check out Sqoop user guide if you have further questions [1].

Jarcec

Links:
1: http://sqoop.apache.org/docs/1.4.3/SqoopUserGuide.html#_importing_data_into_hive

On Tue, Apr 02, 2013 at 10:34:52AM -0700, Thuy Nguyen wrote:
> Hi all,
>
> I got error exporting from hive to mysql (without --direct option) as below:
>
> java.lang.NumberFormatException: For input string: "\N"
>         at
> java.lang.NumberFormatException.forInputString(NumberFormatException.java:48)
>         at java.lang.Long.parseLong(Long.java:410)
>         at java.lang.Long.valueOf(Long.java:525)
>         at
> hive_category_event_type_facts.__loadFromFields(hive_category_event_type_facts.java:314)
>         at
> hive_category_event_type_facts.parse(hive_category_event_type_facts.java:251)
>         at
> com.cloudera.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:81)
>         at
> com.cloudera.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:40)
>         at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
>         at
> com.cloudera.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:189)
>         at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:647)
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:323)
>         at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:396)
>         at org.apache.had
>
> and by running swoop with --direct option, '\N' is converted to 0 and thus,
> sqoop succeeded.
>
> Could you please advise if --direct option is expected to have that side
> effect?
>
> Thanks,
> Thuy