MapReduce user list: How to configure mapreduce archive size?


Thread:
Xia_Yang@... 2013-04-10, 20:59
Arun C Murthy 2013-04-10, 21:44
Hemanth Yamijala 2013-04-11, 07:28
Xia_Yang@... 2013-04-11, 18:10
Xia_Yang@... 2013-04-11, 20:52
Hemanth Yamijala 2013-04-12, 04:09
Xia_Yang@... 2013-04-16, 17:45
bejoy.hadoop@... 2013-04-16, 18:05
Hemanth Yamijala 2013-04-17, 04:34
Xia_Yang@... 2013-04-17, 18:19
Hemanth Yamijala 2013-04-18, 04:11
Xia_Yang@... 2013-04-19, 00:57
Xia_Yang@... 2013-04-23, 00:38
Re: How to configure mapreduce archive size?
Well, since the DistributedCache is used by the tasktracker, you need to
update the log4j configuration file used by the tasktracker daemon, and then
collect the tasktracker log file from the machine where you see the
distributed cache problem.
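
For illustration, the change could look something like this in the
tasktracker's log4j.properties (the logger names below are assumptions
based on the Hadoop 1.x class names; please verify them against your
distribution):

  # Assumed Hadoop 1.x logger names - verify for your version
  log4j.logger.org.apache.hadoop.mapred.TaskTracker=DEBUG
  log4j.logger.org.apache.hadoop.filecache.TrackerDistributedCacheManager=DEBUG

Restart the tasktracker daemon after making the change so it takes effect.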
On Fri, Apr 19, 2013 at 6:27 AM, <[EMAIL PROTECTED]> wrote:

> Hi Hemanth,
>
> I tried http://machine:50030. It did not work for me.
>
> In the hbase_home/conf folder, I updated the log4j configuration properties
> and got the attached log. Can you see what is happening with the map reduce
> job?
>
> Thanks,
>
> Jane
>
> From: Hemanth Yamijala [mailto:[EMAIL PROTECTED]]
> Sent: Wednesday, April 17, 2013 9:11 PM
> To: [EMAIL PROTECTED]
> Subject: Re: How to configure mapreduce archive size?
>
> The check for cache file cleanup is controlled by the
> property mapreduce.tasktracker.distributedcache.checkperiod. It defaults to
> 1 minute (which should be sufficient for your requirement).
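>
> If you ever need to tune it, an entry along these lines in mapred-site.xml
> should work (I believe the value is in milliseconds, so 60000 would be the
> default 1 minute - please verify against your version's mapred-default.xml):
>
>   <property>
>     <name>mapreduce.tasktracker.distributedcache.checkperiod</name>
>     <value>60000</value>
>   </property>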
>
> I am not sure why the JobTracker UI is inaccessible. If you know where JT
> is running, try hitting http://machine:50030. If that doesn't work, maybe
> check whether the port has been changed in mapred-site.xml via a property
> similar to mapred.job.tracker.http.address.
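>
> For reference, the stock setting looks like this (0.0.0.0:50030 is the
> usual default, as far as I recall; treat the exact value as an assumption
> for your setup):
>
>   <property>
>     <name>mapred.job.tracker.http.address</name>
>     <value>0.0.0.0:50030</value>
>   </property>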
>
> There is logging in the code of the tasktracker component that can help
> debug the distributed cache behaviour. In order to get those logs you need
> to enable debug logging in the log4j configuration properties and restart
> the daemons. Hopefully that will help you get some hints on what is
> happening.
>
> Thanks
>
> hemanth
>
> On Wed, Apr 17, 2013 at 11:49 PM, <[EMAIL PROTECTED]> wrote:
>
> Hi Hemanth and Bejoy KS,
>
> I have tried both mapred-site.xml and core-site.xml. They do not work. I
> set the value to 50K just for testing purposes, but the folder size has
> already grown to 900M. As you said in your email, “After they are done, the
> property will help cleanup the files due to the limit set.” How frequently
> will the cleanup task be triggered?
>
> Regarding the job.xml, I cannot use the JT web UI to find it. It seems that
> when hadoop is packaged within HBase, this is disabled. I am only running
> HBase jobs. I was advised by the HBase people to get help from the Hadoop
> mailing list. I will contact them again.
>
> Thanks,
>
> Jane
>
> From: Hemanth Yamijala [mailto:[EMAIL PROTECTED]]
> Sent: Tuesday, April 16, 2013 9:35 PM
> To: [EMAIL PROTECTED]
> Subject: Re: How to configure mapreduce archive size?
>
> You can limit the size by setting local.cache.size in mapred-site.xml
> (or core-site.xml if that works for you). I mistakenly mentioned
> mapred-default.xml in my last mail - apologies for that. However, please
> note that this does not prevent whatever is writing into the distributed
> cache from creating those files when they are required. After they are
> done, the property will help clean up the files due to the limit set.
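>
> For example, something like this in mapred-site.xml (the value is in
> bytes, and 10737418240, i.e. 10GB, is the default if I remember right;
> your 50K test value would be 51200):
>
>   <property>
>     <name>local.cache.size</name>
>     <value>51200</value>
>   </property>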
>
> That's why I am more keen on finding out what is using the files in the
> distributed cache. It may be useful to ask on the HBase list as well
> whether the APIs you are using are creating the files you mention (assuming
> you are only running HBase jobs on the cluster and nothing else).
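>
> For instance, if your jobs are set up through something like HBase's
> TableMapReduceUtil (just a guess - I don't know your code), it adds the
> HBase and dependency jars to the job behind the scenes, and those travel
> through the distributed cache even though you never call DistributedCache
> yourself. A sketch of such a job setup, with placeholder names:
>
>   import org.apache.hadoop.conf.Configuration;
>   import org.apache.hadoop.hbase.HBaseConfiguration;
>   import org.apache.hadoop.hbase.client.Result;
>   import org.apache.hadoop.hbase.client.Scan;
>   import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
>   import org.apache.hadoop.hbase.mapreduce.IdentityTableMapper;
>   import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
>   import org.apache.hadoop.mapreduce.Job;
>   import org.apache.hadoop.mapreduce.lib.output.NullOutputFormat;
>
>   public class HBaseScanSketch {
>     public static void main(String[] args) throws Exception {
>       Configuration conf = HBaseConfiguration.create();
>       Job job = new Job(conf, "hbase-scan-sketch"); // placeholder job name
>       job.setJarByClass(HBaseScanSketch.class);
>       // initTableMapperJob configures the scan *and* ships jars via the
>       // distributed cache - the part that fills .../mapred/local/archive.
>       TableMapReduceUtil.initTableMapperJob(
>           "my_table", new Scan(), IdentityTableMapper.class,
>           ImmutableBytesWritable.class, Result.class, job);
>       job.setOutputFormatClass(NullOutputFormat.class); // no output needed
>       job.setNumReduceTasks(0);
>       System.exit(job.waitForCompletion(true) ? 0 : 1);
>     }
>   }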
>
> Thanks
>
> Hemanth
>
> On Tue, Apr 16, 2013 at 11:15 PM, <[EMAIL PROTECTED]> wrote:
>
> Hi Hemanth,
>
> I did not explicitly use DistributedCache in my code, nor did I use any
> command-line arguments like -libjars.
>
> Where can I find job.xml? I am using the HBase MapReduce API and not
> setting any job.xml.
>
> The key point is that I want to limit the size of
> /tmp/hadoop-root/mapred/local/archive. Could you help?
>
> Thanks.