-Re: Accumulo Input Format over hadoop blocks
Roshan Punnoose 2012-07-09, 18:52
Thanks, that makes perfect sense. My assumption that the mapper is pulling
the data from the hadoop blocks was wrong. Thanks for the full explanation,
that really helps.
On Mon, Jul 9, 2012 at 2:43 PM, John Vines <[EMAIL PROTECTED]> wrote:
> On Mon, Jul 9, 2012 at 2:24 PM, Roshan Punnoose <[EMAIL PROTECTED]> wrote:
>> This might be a very easy question, but I was wondering how the Accumulo
>> Input Format handled a tablet file splitting over multiple nodes.
>> For example, if I have a tablet file that is 1GB large, where my hadoop
>> block size is 256MB. Then there is a possibility that up to 4 nodes could
>> be holding the data from my tablet file. However, when Accumulo Input
>> Format creates mappers, it creates a mapper for every tablet. This might
>> mean that 3 blocks are transferred over the network to where the mapper is
>> running to ensure data locality.
>> Am I correct in this assumption? Or is there something else the
>> TabletServer is doing underneath to make sure all the data actually resides
>> in one server, so there is no network overhead of moving blocks before a
>> Map Reduce job.
> If a single file spans 4 HDFS blocks, there is a reasonable assumption
> that a single datanode possesses all 4 blocks of that one file (it's an
> assumption because if the datanode died and data was rereplicated that
> guarantee is lost). The node which possesses all 4 blocks is the same as
> the tserver who wrote that data. More likely than not, that file was
> written by a tserver at major compaction time. Factoring that with our
> attempts to do unnecessary migrations, then in most cases you will see
> minimal data over the network. Yes, occasionally you will do some over the
> network transfers due to tablet migrations, data that hasn't been compacted
> in a while, nodes failures, etc., but these are by no means the norm.
> For a bit more education, when using the Accumulo Input Format, the mapper
> task is actually talking to the tserver, and only the tserver, for reading
> in data. This is because the tablet server is doing a merged read of the
> data, applying all scan time iterators (including visibility filtering),
> and then giving results back to the Mapper. So even if there were blocks
> over the network, there really couldn't be anything done in the MapReduce
> job to ensure locality because you can't have partial tablets handled
> because of the way deletes, versioning, and aggregation work. If there are
> concerns about locality on your system, forcing a compaction will ensure
> data locality, but this really isn't necessary unless your system has had a
> lot of failures or oddly distributed ingest.