Sqoop >> mail # user >> Sqoop - utf-8 data load issue


Re: Sqoop - utf-8 data load issue
Thank you for the additional information Varun! Would you mind doing something like the following:

 hadoop dfs -text THE_FILE | hexdump -C

And sharing the output? I'm trying to see the actual content of the file rather than any interpreted value.

Jarcec
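
[Editor's sketch, not part of the original message.] To make the hexdump request concrete, the following Python snippet shows which bytes one would expect for the example value from this thread if it is stored as correct UTF-8, and which bytes would appear under one common failure mode, where UTF-8 bytes are misread as Latin-1 and then re-encoded as UTF-8. That failure mode is only an assumption for illustration; the thread does not establish what actually happened to the data. The helper names `hex_bytes` and `looks_double_encoded` are hypothetical.

```python
def hex_bytes(data: bytes) -> str:
    """Return the bytes as space-separated hex pairs, like one row of hexdump -C."""
    return " ".join(f"{b:02x}" for b in data)


def looks_double_encoded(data: bytes) -> bool:
    """Heuristic: True if undoing one Latin-1 -> UTF-8 round trip yields different valid UTF-8."""
    try:
        text = data.decode("utf-8")
        repaired = text.encode("latin-1").decode("utf-8")
    except (UnicodeDecodeError, UnicodeEncodeError):
        return False
    return repaired != text


# The example value from the thread, written with escapes to avoid editor encoding issues.
source = "el\u00e9me\u00f1t"

# Correct UTF-8: e-acute is c3 a9, n-tilde is c3 b1.
good = source.encode("utf-8")

# Assumed failure mode: UTF-8 bytes decoded as Latin-1, then encoded as UTF-8 again.
double_encoded = good.decode("latin-1").encode("utf-8")

print(hex_bytes(good))            # 65 6c c3 a9 6d 65 c3 b1 74
print(hex_bytes(double_encoded))  # 65 6c c3 83 c2 a9 6d 65 c3 83 c2 b1 74
print(looks_double_encoded(good), looks_double_encoded(double_encoded))  # False True
```

If the hexdump of the imported file shows `c3 a9` and `c3 b1`, the data in HDFS is intact and the problem is more likely in how it is displayed. A sequence such as `c3 83 c2 a9` would instead point to a conversion somewhere between the database and the file.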

On Mon, Jul 15, 2013 at 06:52:11PM -0700, varun kumar gullipalli wrote:
> Hi Jarcec,
>
> I am validating the data by running the following command,
>
> hadoop fs -text <hdfs cluster>
>
> I think there is no issue with the shell (correct me if I am wrong) because I am connecting to the MySQL database from the same shell (command line) and can view the source data properly.
>
> Initially we observed that the following conf files didn't have the UTF-8 encoding declaration:
> <?xml version="1.0" encoding="UTF-8"?>
>
> sqoop-site.xml
> sqoop-site-template.xml
>
> But no luck even after making the changes.
>
> Thanks,
> Varun
>
>
> ________________________________
>  From: Jarek Jarcec Cecho <[EMAIL PROTECTED]>
> To: [EMAIL PROTECTED]; varun kumar gullipalli <[EMAIL PROTECTED]>
> Sent: Monday, July 15, 2013 6:37 PM
> Subject: Re: Sqoop - utf-8 data load issue
>  
>
> Hi Varun,
> we usually don't see any issues with transferring text data in UTF-8. How are
> you validating the imported file? I can imagine that your shell might be messing
> up the encoding.
>
> Jarcec
>
> On Mon, Jul 15, 2013 at 06:27:25PM -0700, varun kumar gullipalli wrote:
> >
> >
> > Hi,
> > I am importing data from MySQL to HDFS using a free-form query import.
> > It works fine, but I am facing an issue when the data is UTF-8. The source (MySQL) database is UTF-8 compatible, but it looks like Sqoop is converting the data during import.
> > Example - The source value - elémeñt is loaded as elémeñt to HDFS.
> > Please provide a solution for this.
> > Thanks in advance!