

How to submit Tool jobs programmatically in parallel?
I'm submitting unrelated jobs programmatically (using AWS EMR) so they run
in parallel.

I'd like to run an s3distcp job in parallel as well, but the interface to
that job is a Tool, e.g. ToolRunner.run(...).

ToolRunner blocks until the job completes though, so presumably I'd need to
create a thread pool to run these jobs in parallel.

But creating multiple threads to submit concurrent jobs via ToolRunner,
each blocking on its job's completion, just feels improper. Is there an
alternative?
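
For reference, the thread-pool approach I'm describing would look roughly like the sketch below. It is only an illustration: the class name ParallelToolSubmitter and the runAll helper are made up, and each job is modelled as a plain Callable<Integer> so the snippet compiles without Hadoop on the classpath. In the real code I'd expect each Callable to wrap a ToolRunner.run(conf, tool, args) call and return its exit code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical helper, not part of Hadoop or EMR.
public class ParallelToolSubmitter {

    /**
     * Runs every blocking job on its own pool thread and returns the exit
     * codes in the same order as the input list.
     *
     * Assumption: with Hadoop available, each Callable would look like
     *   () -> ToolRunner.run(conf, tool, args)
     * which blocks until that job finishes and returns its exit code.
     */
    public static List<Integer> runAll(List<Callable<Integer>> jobs, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // invokeAll blocks until all jobs have completed (or failed).
            List<Future<Integer>> futures = pool.invokeAll(jobs);
            List<Integer> exitCodes = new ArrayList<>();
            for (Future<Integer> future : futures) {
                exitCodes.add(future.get());
            }
            return exitCodes;
        } catch (InterruptedException e) {
            // Restore the interrupt flag before giving up.
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for jobs", e);
        } catch (ExecutionException e) {
            // A job threw; surface the underlying cause.
            throw new IllegalStateException("A job failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

The caller would build one Callable per job, pass the list to runAll with a pool size equal to the number of jobs that should run at once, and check the returned exit codes. This is exactly the "one thread parked per job" pattern that feels improper to me, hence the question.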