Kartashov, Andy 2012-11-23, 21:36
Guys,
I know that there is an old and a new API for MapReduce. The old API is found under org.apache.hadoop.mapred and the new one is under org.apache.hadoop.mapreduce.
I have successfully used both (the old and the new API) when writing my MapReduce drivers.
The problem came up when I tried to use the distributed cache. My new-API Job object could not locate the method public void addCacheFile(URI uri) <http://hadoop.apache.org/docs/current/api/src-html/org/apache/hadoop/mapreduce/Job.html#line.1016>, and I was scratching my head as to why.
What I did not realise is that, besides the old and new APIs, there are also the Hadoop 0.20 and Hadoop 2.0.0 APIs, which use exactly the same packages. In the old Hadoop 0.20 release, the new-API (mapreduce) Job class simply doesn't have the addCacheFile(URI uri) method.
I am running Hadoop 2.0.0, so I could not understand why the method was not in the class. I ended up rewriting the MR job under the old API (mapred package) and it ran successfully.
Can anyone shed some light on this?
Thanks,
AK
NOTICE: This e-mail message and any attachments are confidential, subject to copyright and may be privileged. Any unauthorized use, copying or disclosure is prohibited. If you are not the intended recipient, please delete and contact the sender immediately. Please consider the environment before printing this e-mail.
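A note on the symptom described above: since both Hadoop lines ship a class named org.apache.hadoop.mapreduce.Job, one way to see which variant a given classpath actually provides is to probe it with reflection. The following is only a sketch using the standard Java reflection API; the class name ApiProbe and the helper hasPublicMethod are hypothetical names introduced for illustration, and the thread itself does not establish what was on the poster's classpath.

```java
import java.net.URI;

// Hypothetical helper (not part of Hadoop): reports whether a class visible on
// the current classpath exposes a public method with the given signature.
final class ApiProbe {

    static boolean hasPublicMethod(String className, String methodName, Class<?>... paramTypes) {
        try {
            // Load the class without running its static initializers.
            Class<?> type = Class.forName(className, false, ApiProbe.class.getClassLoader());
            // getMethod only finds public methods (declared or inherited).
            type.getMethod(methodName, paramTypes);
            return true;
        } catch (ClassNotFoundException | NoSuchMethodException | LinkageError e) {
            // Class missing, method missing, or class could not be linked.
            return false;
        }
    }

    public static void main(String[] args) {
        // Class and method names are taken from the message above.
        boolean found = hasPublicMethod("org.apache.hadoop.mapreduce.Job", "addCacheFile", URI.class);
        System.out.println("Job.addCacheFile(URI) available: " + found);
    }
}
```

Running it with the same classpath that is used to compile the driver shows whether that classpath exposes the method; a result of false would suggest that an older Hadoop jar is being picked up, or that no Hadoop jar is present at all.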
Harsh J 2012-11-24, 07:21
You could use the org.apache.hadoop.filecache.DistributedCache API as:
DistributedCache.addCacheFile(URI, job.getConfiguration());
-- Harsh J
Mahesh Balija 2012-11-27, 05:22
Hi AK,
I don't really understand what is stopping you from using the job.getConfiguration() method to pass the configuration instance, as in DistributedCache.addCacheFile(URI, job.getConfiguration()). The only thing you need to do is pass the URI and the configuration object (obtained from the org.apache.hadoop.mapreduce.Job instance).
Best, Mahesh.B. Calsoft Labs.
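As a rough sketch of the suggestion above, a new-API driver can register a cache file without any JobConf, because job.getConfiguration() on an org.apache.hadoop.mapreduce.Job returns a plain Configuration. The class name CacheFileDriver, the job name, and the argument layout are assumptions made for illustration; no mapper or reducer is set, so Hadoop's defaults apply. It needs the Hadoop jars on the classpath to compile.

```java
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.filecache.DistributedCache;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// Hypothetical driver; assumed arguments: <input path> <output path> <URI of file to cache>
public class CacheFileDriver {
    public static void main(String[] args) throws Exception {
        // new Job(Configuration) is deprecated on Hadoop 2.x but still available there.
        Job job = new Job(new Configuration());
        job.setJarByClass(CacheFileDriver.class);
        job.setJobName("cache-file-example");

        // Register the file on the configuration held by the Job itself.
        // The Job works on its own copy of the Configuration passed to its
        // constructor, so use job.getConfiguration() here.
        DistributedCache.addCacheFile(new URI(args[2]), job.getConfiguration());

        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

The DistributedCache call has to come before the job is submitted, since it only records the URI in the job's configuration.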
On Mon, Nov 26, 2012 at 8:18 PM, Kartashov, Andy <[EMAIL PROTECTED]> wrote:
> Harsh,
> Thanks for the "DistributedCache.addCacheFile(URI, job.getConfiguration());" suggestion.
> What class does your job instance belong to? It is not the Job class, for sure. So it must be JobContext?
> When I write my driver using the new API, I write:
> Job job = new Job();
> job.setJarByClass(....
> job.setJobName(...
> job.setMapOutputKey... | .. value
> ......Reduce.....
> So how can I use your piece of code here, i.e. DistributedCache.addCacheFile(URI, job.getConfiguration());?
> How can I wire JobConf to Job instances?
> Thanks,
> AK