We recently ran into an issue: the more map tasks have completed, the
longer each subsequent map task takes to complete.
In our architecture we use two scanners to read the table. The first one,
the 'outer' scanner, reads the table and selects some rowkeys. These
rowkeys are then used as a filter for the second, 'internal' scanner. Thus
we constantly open a new 'internal' scanner with a different filter.
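To make the pattern concrete, here is a minimal in-memory sketch in Java.
It is not our actual job code: the TableScanner class, the "idx-" and
"data-" key prefixes, and the open-scanner counter are all hypothetical
stand-ins, and a sorted map plays the role of the table. It only shows the
shape of the loop, with every 'internal' scanner closed before the next
one is opened.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Predicate;

public class NestedScanSketch {
    /** Number of scanners currently open; a leak would show up here. */
    public static final AtomicInteger OPEN_SCANNERS = new AtomicInteger();

    /** Hypothetical stand-in for a table scanner (not a real client class). */
    public static final class TableScanner implements AutoCloseable {
        private final List<String> rows = new ArrayList<>();

        TableScanner(NavigableMap<String, String> table, Predicate<String> rowKeyFilter) {
            OPEN_SCANNERS.incrementAndGet();
            // Keep only the rowkeys accepted by the filter, in key order.
            for (String key : table.keySet()) {
                if (rowKeyFilter.test(key)) {
                    rows.add(key);
                }
            }
        }

        List<String> rows() {
            return rows;
        }

        @Override
        public void close() {
            OPEN_SCANNERS.decrementAndGet();
        }
    }

    /** Outer scan selects rowkeys; each one drives a fresh inner scan. */
    public static List<String> scan(NavigableMap<String, String> table) {
        List<String> result = new ArrayList<>();
        try (TableScanner outer = new TableScanner(table, key -> key.startsWith("idx-"))) {
            for (String outerKey : outer.rows()) {
                // Assumed mapping from an outer rowkey to the inner filter.
                String wanted = "data-" + outerKey.substring("idx-".length());
                // try-with-resources closes the inner scanner on every iteration.
                try (TableScanner inner = new TableScanner(table, wanted::equals)) {
                    result.addAll(inner.rows());
                }
            }
        }
        return result;
    }
}
```

In this model OPEN_SCANNERS is back at zero after scan() returns. I have
not yet verified that the same holds for the scanners in our real job.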
As an additional symptom, we see that our cluster is practically idle:
there is no CPU load, no disk load, no network traffic, etc. Usually that
means we are waiting on some locks, but I'm not sure.
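One way to check the lock suspicion would be to look at thread states in
the affected JVM. Below is a rough sketch using the standard
java.lang.management API; the class and method names (LockWaitProbe,
countByState, describeWaiters) are my own, and it only inspects the JVM it
runs in, so for a running task process a thread dump taken with jstack
would give the same kind of information.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.EnumMap;
import java.util.Map;

public class LockWaitProbe {
    /** Counts the live threads of this JVM by thread state. */
    public static Map<Thread.State, Integer> countByState() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        Map<Thread.State, Integer> counts = new EnumMap<>(Thread.State.class);
        for (ThreadInfo info : mx.dumpAllThreads(false, false)) {
            if (info == null) {
                continue; // thread ended while the dump was being taken
            }
            counts.merge(info.getThreadState(), 1, Integer::sum);
        }
        return counts;
    }

    /** Lists threads that are blocked or waiting, with the lock they wait on. */
    public static String describeWaiters() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        StringBuilder out = new StringBuilder();
        for (ThreadInfo info : mx.dumpAllThreads(false, false)) {
            if (info == null) {
                continue;
            }
            Thread.State state = info.getThreadState();
            if (state == Thread.State.BLOCKED
                    || state == Thread.State.WAITING
                    || state == Thread.State.TIMED_WAITING) {
                // Lock name and owner are null when the thread is not waiting on a lock.
                out.append(info.getThreadName()).append(' ').append(state)
                   .append(" on ").append(info.getLockName())
                   .append(" held by ").append(info.getLockOwnerName())
                   .append('\n');
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(countByState());
        System.out.print(describeWaiters());
    }
}
```

If most task threads turned out to be BLOCKED or WAITING on the same lock
while the machines are idle, that would support the lock theory. If they
are mostly waiting on I/O, the cause is probably somewhere else.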
I would appreciate any ideas or suggestions that could help us understand
what is going on.
Thank you in advance.
Developer Grid Dynamics