Re: Quick question
This is the most important thing that you have said. The map function
is called once per unit of input, but the mapper object persists
across many units of input.
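
To make that distinction concrete, here is a minimal sketch in plain
Java. It does not use the real Hadoop API: CountingMapper and runSplit
are hypothetical stand-ins that only imitate the lifecycle described
above, with one object created per piece of input and one map() call
per record.

```java
import java.util.Arrays;
import java.util.List;

public class MapperLifecycleSketch {
    // Hypothetical stand-in for a Hadoop mapper; this is NOT the Hadoop API.
    static class CountingMapper {
        static int objectsCreated = 0; // how many mapper objects were built
        int mapCalls = 0;              // state that survives across map() calls

        CountingMapper() {
            objectsCreated++;
        }

        // Called once per unit of input (here, one line of text).
        void map(String record) {
            mapCalls++;
        }
    }

    // Feeds every record of one piece of input to a single mapper object
    // and returns {mapper objects created, map() calls made}.
    public static int[] runSplit(List<String> records) {
        CountingMapper.objectsCreated = 0;
        CountingMapper mapper = new CountingMapper();
        for (String record : records) {
            mapper.map(record);
        }
        return new int[] { CountingMapper.objectsCreated, mapper.mapCalls };
    }

    public static void main(String[] args) {
        int[] counts = runSplit(Arrays.asList("line 1", "line 2", "line 3"));
        // prints "mapper objects: 1, map() calls: 3"
        System.out.println("mapper objects: " + counts[0]
                + ", map() calls: " + counts[1]);
    }
}
```

Because the object outlives each call, anything stored in a field
(like mapCalls here) is still there when the next record arrives.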

You have a little control over how many mapper objects there are, how
many machines they are created on, and how many pieces your input is
broken into. That control is limited, however, unless you build your
own input format. The standard input formats are optimized for very
large inputs and may not give you the flexibility that you want for
your experiments. That is unfortunate for the purpose of learning
about Hadoop, but Hadoop is designed mostly for dealing with very
large data, and being easy to understand is not usually a design
goal. Where easy coincides with powerful, easy is good, but powerful
isn't always easy.
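
As a rough illustration of how the number of input pieces can move, the
sketch below models one rule, assumed here to resemble what the
file-based input formats do: clamp the block size between a minimum
and a maximum split size, then divide the file by the result. The
class and method names are hypothetical, the sizes are made up, and
the real formats handle details this ignores, so treat it as an
approximation rather than Hadoop's actual behavior.

```java
public class SplitCountSketch {
    // Assumed rule: the split size is the block size, clamped to [minSize, maxSize].
    public static long splitSize(long blockSize, long minSize, long maxSize) {
        return Math.max(minSize, Math.min(maxSize, blockSize));
    }

    // Approximate number of pieces for one file, using ceiling division.
    public static long splitCount(long fileSize, long splitSize) {
        return (fileSize + splitSize - 1) / splitSize;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long file = 256 * mb;   // made-up file size
        long block = 64 * mb;   // made-up block size

        // No effective maximum: pieces follow the block size. Prints 4.
        System.out.println(splitCount(file, splitSize(block, 1, Long.MAX_VALUE)));

        // A 16 MB maximum forces smaller pieces. Prints 16.
        System.out.println(splitCount(file, splitSize(block, 1, 16 * mb)));
    }
}
```

With a small experimental file, the input usually fits in one piece
under a rule like this, which is why you tend to see a single mapper
object unless you change the split settings or the input format.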

On Sunday, February 20, 2011, maha <[EMAIL PROTECTED]> wrote:
> So first question: is there a difference between Mappers and maps ?