Re: Sqoop - utf-8 data load issue
Here is the output Jarcec...
 
 
________________________________
From: Jarek Jarcec Cecho <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]; varun kumar gullipalli <[EMAIL PROTECTED]>
Sent: Tuesday, July 16, 2013 11:05 AM
Subject: Re: Sqoop - utf-8 data load issue
Thank you for the additional information Varun! Would you mind doing something like the following:

hadoop dfs -text THE_FILE  | hexdump -C

And sharing the output? I'm trying to see the actual content of the file rather than any interpreted value.

Jarcec

On Mon, Jul 15, 2013 at 06:52:11PM -0700, varun kumar gullipalli wrote:
> Hi Jarcec,
>
> I am validating the data by running the following command,
>
> hadoop fs -text <hdfs cluster>
>
> I think there is no issue with the shell (correct me if I am wrong), because I connect to the MySQL database from the same shell (command line) and can view the source data properly.
>
> Initially we observed that the following conf files don't have the utf-8 encoding declaration:
> <?xml version="1.0" encoding="UTF-8"?>
>
> sqoop-site.xml
> sqoop-site-template.xml
>
> But no luck even after making the changes.
>
> Thanks,
> Varun
>
>
> ________________________________
>  From: Jarek Jarcec Cecho <[EMAIL PROTECTED]>
> To: [EMAIL PROTECTED]; varun kumar gullipalli <[EMAIL PROTECTED]>
> Sent: Monday, July 15, 2013 6:37 PM
> Subject: Re: Sqoop - utf-8 data load issue

>
> Hi Varun,
> we usually don't see any issues with transferring text data in UTF-8. How are
> you validating the imported file? I can imagine that your shell might be messing
> up the encoding.
>
> Jarcec
>
> On Mon, Jul 15, 2013 at 06:27:25PM -0700, varun kumar gullipalli wrote:
> >
> >
> > Hi,
> > I am importing data from MySQL to HDFS using a free-form query import.
> > It works fine, but I am facing an issue when the data is utf-8. The source (MySQL) db is utf-8 compatible, but it looks like Sqoop is converting the data during import.
> > Example - The source value elémeñt is loaded as elÃ©meÃ±t to HDFS.
> > Please provide a solution for this.
> > Thanks in advance!
00000000  31 32 33 34 35 36 37 38  39 30 07 31 33 37 33 32  |1234567890.13732|
00000010  36 30 33 34 36 31 35 31  07 31 33 37 33 32 36 30  |60346151.1373260|
00000020  33 34 36 31 35 31 07 30  07 65 6c c3 83 c2 a9 6d  |346151.0.el....m|
00000030  65 c3 83 c2 b1 74 07 c3  a8 c2 b4 c2 bc c3 a2 e2  |e....t..........|
00000040  80 9a c2 ac c3 ac e2 80  9a c2 ac c3 ac e2 80 93  |................|
00000050  c2 b4 c3 a8 e2 80 b0 c2  be c3 a8 c2 a5 c2 bf 0a  |................|
00000060
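
Reading the dump above: the first text field is stored as 65 6c c3 83 c2 a9 6d 65 c3 83 c2 b1 74, whereas plain UTF-8 for elémeñt would be 65 6c c3 a9 6d 65 c3 b1 74. That pattern is what you would expect when UTF-8 bytes are read as a single-byte charset and then encoded as UTF-8 a second time. Below is a minimal Python sketch of that round trip. It assumes the single-byte charset is Windows-1252; that is only an inference from sequences such as e2 80 9a in the second field, and nothing in the thread confirms it. The function names are made up for illustration.

```python
# Bytes copied from the hexdump above, offsets 0x29 through 0x5e:
# the two text fields, separated by the 0x07 field delimiter.
RAW_HEX = (
    "65 6c c3 83 c2 a9 6d 65 c3 83 c2 b1 74 07 "
    "c3 a8 c2 b4 c2 bc c3 a2 e2 80 9a c2 ac c3 ac e2 "
    "80 9a c2 ac c3 ac e2 80 93 c2 b4 c3 a8 e2 80 b0 "
    "c2 be c3 a8 c2 a5 c2 bf"
)

# Assumption: the intermediate single-byte charset is Windows-1252.
ASSUMED_CHARSET = "cp1252"


def double_encode(text: str) -> bytes:
    """Reproduce the suspected corruption: UTF-8 bytes misread as
    ASSUMED_CHARSET, then written out as UTF-8 again."""
    misread = text.encode("utf-8").decode(ASSUMED_CHARSET)
    return misread.encode("utf-8")


def undo_double_encoding(raw: bytes) -> str:
    """Reverse the round trip above for one field of the file."""
    misread = raw.decode("utf-8")                  # what a UTF-8 reader sees
    original_bytes = misread.encode(ASSUMED_CHARSET)
    return original_bytes.decode("utf-8")          # the value before corruption


# Split on the 0x07 delimiter first, then repair each field separately.
fields = [undo_double_encoding(f) for f in bytes.fromhex(RAW_HEX).split(b"\x07")]

for field in fields:
    print(field)
```

If the assumption holds, the first field comes back as elémeñt, and double_encode("elémeñt") reproduces the bytes in the file. This only shows what happened to the bytes; it does not say which component (the database connection, the driver, or Sqoop) did the re-encoding.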