Home | About | Sematext search-lucene.com search-hadoop.com
 Search Hadoop and all its subprojects:

Switch to Threaded View
Accumulo, mail # user - Straggler problem in Accumulo BatchScans


Copy link to this message
-
Re: Straggler problem in Accumulo BatchScans
James Hughes 2013-08-21, 23:29
David,

Each tablet is hosted by one tablet server, and there's no way around
that.  (This is actually quite reasonably; otherwise, we would receive
duplicate results from multiple tablet servers.)

One strategy to deal with imbalanced data is to add a random partition
prefix to your row Ids.  This does complicate building queries, but in
general, you'll be able to leverage all of your nodes.  I did some testing
with the nodes of such random shard ids, and it seems like having 1-2x as
many shards as tablet servers worked pretty well.  (I'd suggest 2x in case
you ever grow your cloud.)

In particular, if you can reingest your data, prepend a random "01-14~" to
the beginning of each row Id, and see if that helps.  After that, you can
"help" Accumulo decide where it should split tablets with addSplits 01 02
<etc> 14 from the Accumulo shell (or programmatically with the addSplits).
After that, you can make sure that your 14+ splits are distributed across
the 7 nodes in a reasonable way.

I hope that helps,

Jim

http://accumulo.apache.org/1.4/apidocs/org/apache/accumulo/core/client/admin/TableOperations.html#addSplits%28java.lang.String,%20java.util.SortedSet%29

On Wed, Aug 21, 2013 at 7:09 PM, Slater, David M.
<[EMAIL PROTECTED]>wrote:

> Hey, I have a 7 node network running accumulo 1.4.1 and hadoop 1.0.4.****
>
> ** **
>
> When I run large BatchScanner operations, the number of tablets scanned
> per node is not uniform, leading to the overloaded nodes taking much longer
> to finish than the others. For queries that require all of the scans to
> finish before returning, this is a major latency issue. What are some
> practical means of load-balancing this to reduce delay?****
>
> ** **
>
> Is it possible for tablets to be hosted on multiple tablet servers, up to
> the replication factor of the underlying hdfs? Are there reasons this might
> be an undesirable design?****
>
> ** **
>
> Thanks in advance,
> David ****
>