Re: reading distributed cache returns null pointer
I am not sure why you are using the getFileClassPaths() API to access the files...
Here is what works for us:

Add the file(s) to distributed cache using:
DistributedCache.addCacheFile(p.toUri(), conf);

Read the files on the mapper using:

URI[] uris = DistributedCache.getCacheFiles(conf);
// access one of the files:
Path path = new Path(uris[0].getPath());
// now use the Hadoop FileSystem API (or local file APIs) to read the file...
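For reference, here is a rough, untested sketch that puts both pieces
together (old org.apache.hadoop.mapred API; the class name OrdersMapper and
the surrounding boilerplate are placeholders, not taken from your code). It
reads through the Hadoop FileSystem API rather than FileReader, because
getCacheFiles() returns the HDFS URI you added in the driver, not a
localized path:

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URI;

import org.apache.hadoop.filecache.DistributedCache;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;

// Driver side: register the file before submitting the job.
//   Path p = new Path("hdfs://localhost:9100/user/denimLive/denim/DCache/Orders.txt");
//   DistributedCache.addCacheFile(p.toUri(), conf);

public class OrdersMapper extends MapReduceBase /* implements Mapper<...> */ {

    @Override
    public void configure(JobConf job) {
        try {
            // getCacheFiles() returns the URIs that were added in the driver
            URI[] uris = DistributedCache.getCacheFiles(job);
            if (uris == null || uris.length == 0) {
                throw new IOException("no files in the distributed cache");
            }
            Path cachePath = new Path(uris[0].getPath());

            // open the file through the default FileSystem (HDFS in
            // pseudo-distributed mode) instead of java.io.FileReader
            FileSystem fs = FileSystem.get(job);
            BufferedReader reader = new BufferedReader(
                    new InputStreamReader(fs.open(cachePath)));
            try {
                String line;
                while ((line = reader.readLine()) != null) {
                    // load the cached data into memory for use in map()
                }
            } finally {
                reader.close();
            }
        } catch (IOException e) {
            throw new RuntimeException("could not read distributed cache file", e);
        }
    }
}

If you specifically want the localized on-disk copies, getLocalCacheFiles(conf)
is the call the tutorial note below is talking about, but as that note says, it
returns nothing under the local JobRunner.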
Did you try the above, and did it not work?

-Rahul

On Thu, Jul 8, 2010 at 12:04 PM, abc xyz <[EMAIL PROTECTED]> wrote:

> Hello all,
>
> As a new user of Hadoop, I am having some problems understanding a few
> things. I am writing a program that loads a file into the distributed cache
> and reads this file in each mapper. In my driver program, I have added the
> file to the distributed cache using:
>
>         Path p = new Path(
>             "hdfs://localhost:9100/user/denimLive/denim/DCache/Orders.txt");
>         DistributedCache.addCacheFile(p.toUri(), conf);
>
> In the configure method of the mapper, I am reading the file from cache
> using:
>             Path[] cacheFiles = DistributedCache.getFileClassPaths(conf);
>             BufferedReader joinReader = new BufferedReader(
>                 new FileReader(cacheFiles[0].toString()));
>
> However, the cacheFiles variable is null.
>
> There is something mentioned in the Yahoo tutorial for Hadoop about the
> distributed cache which I do not understand:
>
> "As a cautionary note: If you use the local JobRunner in Hadoop (i.e., what
> happens if you call JobClient.runJob() in a program with no or an empty
> hadoop-conf.xml accessible), then no local data directory is created; the
> getLocalCacheFiles() call will return an empty set of results. Unit test
> code should take this into account."
>
> What does this mean? I am executing my program in pseudo-distributed mode
> on Windows using Eclipse.
>
> Any suggestion in this regard is highly valued.
>
> Thanks in advance.
>