-
Local jobtracker in test env?
Mohit Anchlia 2012-08-07, 20:57
I just wrote a test where fs.default.name is file:/// and mapred.job.tracker is set to local. The test ran fine, and I can see that the mapper and reducer were invoked. What I am trying to understand is how this ran without a job tracker port being specified, and on which port the task tracker connected to the job tracker. It's not clear from the output below.
Also, what's the difference between this and bringing up a miniDFS cluster?
INFO org.apache.hadoop.mapred.FileInputFormat [main]: Total input paths to process : 1
INFO org.apache.hadoop.mapred.JobClient [main]: Running job: job_local_0001
INFO org.apache.hadoop.mapred.Task [Thread-11]: Using ResourceCalculatorPlugin : null
INFO org.apache.hadoop.mapred.MapTask [Thread-11]: numReduceTasks: 1
INFO org.apache.hadoop.mapred.MapTask [Thread-11]: io.sort.mb = 100
INFO org.apache.hadoop.mapred.MapTask [Thread-11]: data buffer 79691776/99614720
INFO org.apache.hadoop.mapred.MapTask [Thread-11]: record buffer 262144/327680
INFO com.i.cg.services.dp.analytics.hadoop.mapred.GeoLookup [Thread-11]: zip 92127
INFO com.i.cg.services.dp.analytics.hadoop.mapred.GeoLookup [Thread-11]: zip 1
INFO com.i.cg.services.dp.analytics.hadoop.mapred.GeoLookup [Thread-11]: zip 92127
INFO com.i.cg.services.dp.analytics.hadoop.mapred.GeoLookup [Thread-11]: zip 1
INFO org.apache.hadoop.mapred.MapTask [Thread-11]: Starting flush of map output
INFO org.apache.hadoop.mapred.MapTask [Thread-11]: Finished spill 0
INFO org.apache.hadoop.mapred.Task [Thread-11]: Task:attempt_local_0001_m_000000_0 is done. And is in the process of commiting
INFO org.apache.hadoop.mapred.LocalJobRunner [Thread-11]: file:/c:/upb/dp/manchlia-dp/depot/services/data-platform/trunk/analytics/geoinput/geo.dat:0+18
INFO org.apache.hadoop.mapred.Task [Thread-11]: Task 'attempt_local_0001_m_000000_0' done.
INFO org.apache.hadoop.mapred.Task [Thread-11]: Using ResourceCalculatorPlugin : null
INFO org.apache.hadoop.mapred.LocalJobRunner [Thread-11]:
INFO org.apache.hadoop.mapred.Merger [Thread-11]: Merging 1 sorted segments
INFO org.apache.hadoop.mapred.Merger [Thread-11]: Down to the last merge-pass, with 1 segments left of total size: 26 bytes
INFO org.apache.hadoop.mapred.LocalJobRunner [Thread-11]:
INFO com.i.cg.services.dp.analytics.hadoop.mapred.GeoLookup [Thread-11]: Inside reduce
INFO com.i.cg.services.dp.analytics.hadoop.mapred.GeoLookup [Thread-11]: Outside reduce
INFO org.apache.hadoop.mapred.Task [Thread-11]: Task:attempt_local_0001_r_000000_0 is done. And is in the process of commiting
INFO org.apache.hadoop.mapred.LocalJobRunner [Thread-11]:
INFO org.apache.hadoop.mapred.Task [Thread-11]: Task attempt_local_0001_r_000000_0 is allowed to commit now
INFO org.apache.hadoop.mapred.FileOutputCommitter [Thread-11]: Saved output of task 'attempt_local_0001_r_000000_0' to file:/c:/upb/dp/manchlia-dp/depot/services/data-platform/trunk/analytics/geooutput
INFO org.apache.hadoop.mapred.LocalJobRunner [Thread-11]: reduce > reduce
INFO org.apache.hadoop.mapred.Task [Thread-11]: Task 'attempt_local_0001_r_000000_0' done.
INFO org.apache.hadoop.mapred.JobClient [main]: map 100% reduce 100%
INFO org.apache.hadoop.mapred.JobClient [main]: Job complete: job_local_0001
INFO org.apache.hadoop.mapred.JobClient [main]: Counters: 15
INFO org.apache.hadoop.mapred.JobClient [main]: FileSystemCounters
INFO org.apache.hadoop.mapred.JobClient [main]: FILE_BYTES_READ=458
INFO org.apache.hadoop.mapred.JobClient [main]: FILE_BYTES_WRITTEN=96110
INFO org.apache.hadoop.mapred.JobClient [main]: Map-Reduce Framework
INFO org.apache.hadoop.mapred.JobClient [main]: Map input records=2
INFO org.apache.hadoop.mapred.JobClient [main]: Reduce shuffle bytes=0
INFO org.apache.hadoop.mapred.JobClient [main]: Spilled Records=4
INFO org.apache.hadoop.mapred.JobClient [main]: Map output bytes=20
INFO org.apache.hadoop.mapred.JobClient [main]: Total committed heap usage (bytes)=321527808
INFO org.apache.hadoop.mapred.JobClient [main]: Map input bytes=18
INFO org.apache.hadoop.mapred.JobClient [main]: SPLIT_RAW_BYTES=142
INFO org.apache.hadoop.mapred.JobClient [main]: Combine input records=0
INFO org.apache.hadoop.mapred.JobClient [main]: Reduce input records=2
INFO org.apache.hadoop.mapred.JobClient [main]: Reduce input groups=1
INFO org.apache.hadoop.mapred.JobClient [main]: Combine output records=0
INFO org.apache.hadoop.mapred.JobClient [main]: Reduce output records=1
INFO org.apache.hadoop.mapred.JobClient [main]: Map output records=2
INFO com.i.cg.services.dp.analytics.hadoop.mapred.GeoLookup [main]: Inside reduce
INFO com.i.cg.services.dp.analytics.hadoop.mapred.GeoLookup [main]: Outside reduce
Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 4.547 sec
Results :
Tests run: 4, Failures: 0, Errors: 0, Skipped: 0
-
Re: Local jobtracker in test env?
Harsh J 2012-08-07, 21:08
It used the local mode of operation: org.apache.hadoop.mapred.LocalJobRunner
A JobTracker (via MiniMRCluster) is only required for simulating distributed tests.
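To make the "no port needed" point concrete, here is a minimal plain-Java sketch of the kind of decision a job client makes from the mapred.job.tracker setting. It is not Hadoop code: the class RunnerSelectionSketch, the method chooseRunner, and the returned strings are hypothetical, and java.util.Properties stands in for a Hadoop configuration object. The one assumption about Hadoop itself is that an unset mapred.job.tracker behaves like "local".

```java
import java.util.Properties;

// Simplified sketch; the class and method names are hypothetical, not Hadoop APIs.
class RunnerSelectionSketch {

    // Describes where a job would be submitted for the given settings.
    static String chooseRunner(Properties conf) {
        // Assumption: an unset mapred.job.tracker is treated the same as "local".
        String tracker = conf.getProperty("mapred.job.tracker", "local");
        if ("local".equals(tracker)) {
            // No host:port is parsed and nothing is contacted over the network.
            return "in-process (no JobTracker host or port involved)";
        }
        // Any other value is read as the host:port of a running JobTracker.
        return "remote JobTracker at " + tracker;
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        conf.setProperty("fs.default.name", "file:///");
        conf.setProperty("mapred.job.tracker", "local");
        System.out.println(chooseRunner(conf));
    }
}
```

With the settings from the original test, the sketch takes the "local" branch, which is why no JobTracker address ever appears in the log.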
Harsh J
-
Re: Local jobtracker in test env?
Mohit Anchlia 2012-08-07, 23:20
On Tue, Aug 7, 2012 at 2:08 PM, Harsh J <[EMAIL PROTECTED]> wrote:
> It used the local mode of operation:
> org.apache.hadoop.mapred.LocalJobRunner

In local mode, is everything done inside the same JVM, i.e. do the tasktracker, jobtracker, etc. all run in the same JVM? Or does it mean that none of those processes run and everything is pipelined in the same process on the local file system?

> A JobTracker (via MiniMRCluster) is only required for simulating
> distributed tests.
-
Re: Local jobtracker in test env?
Harsh J 2012-08-08, 06:21
Yes, a single JVM (the test JVM itself), and it is the latter approach: no TaskTracker/JobTracker daemons are started.
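As a rough illustration of what "single JVM, no daemons" means, the toy below runs a map step, an in-memory grouping step, and a reduce step on one worker thread of the calling JVM, much as the log shows the tasks running on Thread-11 while the client logs from main. This is not Hadoop's LocalJobRunner. The class LocalModeSketch, its methods, and the word-count-style map and reduce logic are all hypothetical and chosen only to keep the example short.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy illustration only: NOT Hadoop's LocalJobRunner, just the same idea in miniature.
class LocalModeSketch {

    // Hypothetical map step: emit (first token of the line, 1).
    static Map.Entry<String, Integer> map(String line) {
        String key = line.trim().split("\\s+")[0];
        return new AbstractMap.SimpleEntry<>(key, 1);
    }

    // Hypothetical reduce step: sum the values collected for one key.
    static int reduce(List<Integer> values) {
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        return sum;
    }

    // Runs map, an in-memory grouping, and reduce on one worker thread of this JVM.
    // No sockets are opened and no separate processes are started.
    static Map<String, Integer> runJob(List<String> input) {
        Map<String, Integer> output = new TreeMap<>();
        Thread worker = new Thread(() -> {
            Map<String, List<Integer>> grouped = new TreeMap<>();
            for (String line : input) {
                Map.Entry<String, Integer> kv = map(line);
                grouped.computeIfAbsent(kv.getKey(), k -> new ArrayList<>()).add(kv.getValue());
            }
            for (Map.Entry<String, List<Integer>> e : grouped.entrySet()) {
                output.put(e.getKey(), reduce(e.getValue()));
            }
        }, "local-job");
        worker.start();
        try {
            worker.join(); // the caller blocks until the in-process "job" finishes
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for the job", ex);
        }
        return output;
    }

    public static void main(String[] args) {
        System.out.println(runJob(Arrays.asList("92127 a", "92127 b")));
    }
}
```

Because the whole job is ordinary method calls inside the test process, there is nothing for a task tracker to connect to, which matches the absence of any port in the output.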
Harsh J