MapReduce, mail # user - RE: How to configure mapreduce archive size?


Xia_Yang@... 2013-04-10, 20:59
Arun C Murthy 2013-04-10, 21:44
Hemanth Yamijala 2013-04-11, 07:28
Xia_Yang@... 2013-04-11, 18:10
Xia_Yang@... 2013-04-11, 20:52
Hemanth Yamijala 2013-04-12, 04:09
Xia_Yang@... 2013-04-16, 17:45
Hemanth Yamijala 2013-04-17, 04:34
Xia_Yang@... 2013-04-17, 18:19
Hemanth Yamijala 2013-04-18, 04:11
Xia_Yang@... 2013-04-19, 00:57
Re: How to configure mapreduce archive size?
Hemanth Yamijala 2013-04-19, 03:54
Well, since the DistributedCache is used by the tasktracker, you need to
update the log4j configuration file used by the tasktracker daemon, and you
need to get the tasktracker log file from the machine where you see the
distributed cache problem.
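
For example, something along these lines in the tasktracker's
conf/log4j.properties (the logger names below are assumptions based on
Hadoop 1.x and should be verified against your version):

    # Enable DEBUG logging for the distributed cache code and the tasktracker
    log4j.logger.org.apache.hadoop.filecache=DEBUG
    log4j.logger.org.apache.hadoop.mapred.TaskTracker=DEBUG

Restart the tasktracker daemon after making the change so the new level
takes effect.
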
On Fri, Apr 19, 2013 at 6:27 AM, <[EMAIL PROTECTED]> wrote:

> Hi Hemanth,
>
> I tried http://machine:50030. It did not work for me.
>
> In the hbase_home/conf folder, I updated the log4j configuration
> properties and got the attached log. Can you tell what is happening
> with the map reduce job?
>
> Thanks,
>
> Jane
>
> *From:* Hemanth Yamijala [mailto:[EMAIL PROTECTED]]
> *Sent:* Wednesday, April 17, 2013 9:11 PM
> *To:* [EMAIL PROTECTED]
> *Subject:* Re: How to configure mapreduce archive size?
>
> The check for cache file cleanup is controlled by the property
> mapreduce.tasktracker.distributedcache.checkperiod. It defaults to
> 1 minute (which should be sufficient for your requirement).
>
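> A sketch of how that could be tuned in mapred-site.xml (assuming Hadoop
> 1.x, where the value is in milliseconds; please verify for your version):
>
>     <property>
>       <name>mapreduce.tasktracker.distributedcache.checkperiod</name>
>       <!-- illustrative: check once a minute -->
>       <value>60000</value>
>     </property>
>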
> I am not sure why the JobTracker UI is inaccessible. If you know where JT
> is running, try hitting http://machine:50030. If that doesn't work, maybe
> check if ports have been changed in mapred-site.xml for a property similar
> to mapred.job.tracker.http.address.
>
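> For instance, the relevant entry would look something like this (property
> name and default port are from Hadoop 1.x; the host is a placeholder):
>
>     <property>
>       <name>mapred.job.tracker.http.address</name>
>       <!-- bind address and port of the JobTracker web UI -->
>       <value>0.0.0.0:50030</value>
>     </property>
>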
> There is logging in the code of the tasktracker component that can help
> debug the distributed cache behaviour. In order to get those logs you need
> to enable debug logging in the log4j configuration properties and restart
> the daemons. Hopefully that will help you get some hints on what is
> happening.
>
> Thanks
>
> hemanth
>
> On Wed, Apr 17, 2013 at 11:49 PM, <[EMAIL PROTECTED]> wrote:
>
> Hi Hemanth and Bejoy KS,
>
> I have tried both mapred-site.xml and core-site.xml. They do not work. I
> set the value to 50K just for testing purposes, but the folder size has
> already grown to 900M now. As in your email, "After they are done, the
> property will help cleanup the files due to the limit set." How frequently
> will the cleanup task be triggered?
>
> Regarding the job.xml, I cannot use the JT web UI to find it. It seems that
> when hadoop is packaged within Hbase, this is disabled. I am only running
> Hbase jobs. The Hbase people suggested I get help from the Hadoop mailing
> list. I will contact them again.
>
> Thanks,
>
> Jane
>
> *From:* Hemanth Yamijala [mailto:[EMAIL PROTECTED]]
> *Sent:* Tuesday, April 16, 2013 9:35 PM
> *To:* [EMAIL PROTECTED]
> *Subject:* Re: How to configure mapreduce archive size?
>
> You can limit the size by setting local.cache.size in the mapred-site.xml
> (or core-site.xml if that works for you). I mistakenly mentioned
> mapred-default.xml in my last mail - apologies for that. However, please
> note that this does not prevent whatever is writing into the distributed
> cache from creating those files when they are required. After they are
> done, the property will help clean up the files due to the limit set.
>
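> To make that concrete, a minimal sketch of the setting (Hadoop 1.x
> property; the value is in bytes, and the 10GB shown is only illustrative):
>
>     <property>
>       <name>local.cache.size</name>
>       <!-- maximum size of the local distributed cache, in bytes -->
>       <value>10737418240</value>
>     </property>
>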
> That's why I am more keen on finding what is using the files in the
> Distributed cache. It may be useful if you can ask on the HBase list as
> well if the APIs you are using are creating the files you mention (assuming
> you are only running HBase jobs on the cluster and nothing else).
>
> Thanks
>
> Hemanth
>
> On Tue, Apr 16, 2013 at 11:15 PM, <[EMAIL PROTECTED]> wrote:
>
> Hi Hemanth,
>
> I did not explicitly use DistributedCache in my code. I did not use any
> command line arguments like -libjars either.
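>
> To be clear, by explicit use I mean something like the following sketch,
> which appears nowhere in my code (Hadoop 1.x API; the path and class name
> are placeholders):
>
>     import java.net.URI;
>     import org.apache.hadoop.conf.Configuration;
>     import org.apache.hadoop.filecache.DistributedCache;
>
>     public class CacheExample {
>       public static void main(String[] args) throws Exception {
>         Configuration conf = new Configuration();
>         // Ships the given HDFS file to each task's local cache directory
>         DistributedCache.addCacheFile(new URI("/user/example/lookup.dat"), conf);
>       }
>     }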
>
> Where can I find job.xml? I am using the Hbase MapReduce API and not
> setting any job.xml.
>
> The key point is that I want to limit the size of
> /tmp/hadoop-root/mapred/local/archive. Could you help?
>
> Thanks.
Xia_Yang@... 2013-04-23, 00:38
bejoy.hadoop@... 2013-04-16, 18:05