HDFS, mail # user - Unicode issues with Distributed Cache


Re: Unicode issues with Distributed Cache
Shahab Yunus 2013-05-04, 23:43
Anil, what issue are you facing? You mentioned a 'Unicode issue', but
what exactly is the issue?

Regards,
Shahab
On Sat, May 4, 2013 at 2:28 PM, AnilKumar B <[EMAIL PROTECTED]> wrote:

> Hi,
>
> We are adding an ISO-8859-1 encoded file to the Distributed Cache for
> lookup purposes in an MR job.
>
> But when we try to read the contents of the Distributed Cache file in the
> MR job, we are facing Unicode issues.
>
> Please find the sample code snippet below:
> @Override
> protected void setup(Context context)
>         throws java.io.IOException, InterruptedException {
>     Path[] cacheFiles = DistributedCache.getLocalCacheFiles(
>             context.getConfiguration());
>     lookUp = cacheFiles[0];
>     File file = new File(lookUp.toString());
>     reader = new BufferedReader(new InputStreamReader(
>             new FileInputStream(file), Charset.forName("ISO-8859-1")));
>     String line;
>     while ((line = reader.readLine()) != null) {
>         :
>         System.out.println(line);
>         :
>     }
>     reader.close();
> }
>
> But when we try to read the same file manually, as below, on the same
> cluster machine, it works fine.
>
> BufferedReader input = new BufferedReader(
>         new InputStreamReader(new FileInputStream(path.toString()),
>                 Charset.forName("ISO-8859-1")));
>
> May I know whether this is a Distributed Cache issue?
>
>
>
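One thing that may be worth ruling out (this is an assumption, the thread does not confirm the cause): the read side in setup() already decodes with ISO-8859-1, but System.out.println re-encodes each line using the JVM's default charset, and a task JVM can be started with a different default than an interactive shell on the same machine. The sketch below separates the two steps. The class name CharsetCheck and the helpers decodeLatin1 and encodeFor are hypothetical names used only for illustration.

```java
import java.io.PrintStream;
import java.nio.charset.Charset;

public class CharsetCheck {
    // Decode raw bytes as ISO-8859-1, the same charset the reader in setup() uses.
    static String decodeLatin1(byte[] raw) {
        return new String(raw, Charset.forName("ISO-8859-1"));
    }

    // Encode a string with a given output charset, which is roughly what a
    // PrintStream does when it writes the string out.
    static byte[] encodeFor(String s, Charset out) {
        return s.getBytes(out);
    }

    public static void main(String[] args) throws Exception {
        byte[] raw = {(byte) 0xE9}; // e-acute in ISO-8859-1
        String line = decodeLatin1(raw);

        // The charset System.out falls back to; compare this value between
        // the task logs and a manual run on the same node.
        System.out.println("default charset: " + Charset.defaultCharset());

        // Print the code point instead of the character, so the check does
        // not depend on the output encoding at all.
        System.out.println("code point: U+"
                + String.format("%04X", (int) line.charAt(0)));

        // With an ASCII-only output charset the character is not mappable,
        // so it is replaced on the way out even though decoding was correct.
        byte[] ascii = encodeFor(line, Charset.forName("US-ASCII"));
        System.out.println("as US-ASCII byte: " + ascii[0]);

        // Writing through a PrintStream with an explicit encoding avoids
        // relying on the default.
        PrintStream latin1Out = new PrintStream(System.out, true, "ISO-8859-1");
        latin1Out.println(line);
    }
}
```

If the code point printed inside the task is the expected one, the file is being read correctly and only the output encoding differs between the two environments. If it is not, the problem is on the read side and the file bytes in the local cache directory are worth comparing with the original.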