Accumulo, mail # user - BatchWriter performance on 1.4


Re: BatchWriter performance on 1.4
John Vines 2013-09-18, 21:22
Currently the addMutation() code is synchronized, so that is a bottleneck.
A thread would get around this, but then you need to manage the thread
properly.
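
A minimal sketch of the threaded approach mentioned above, assuming one dedicated worker thread per writer. StubWriter, WriterLane, and ingest are hypothetical names used only for illustration; StubWriter is a stand-in with a synchronized add method, not Accumulo's BatchWriter, and plain strings stand in for Mutation objects. "Managing the thread properly" is taken here to mean waiting on submitted work so that failures surface, and shutting the worker down when ingestion ends.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

public class ThreadedIngestSketch {

    /** Hypothetical stand-in for a writer whose add method is synchronized. */
    static class StubWriter {
        private final List<String> received = new ArrayList<>();

        synchronized void addMutations(List<String> mutations) {
            received.addAll(mutations);
        }

        synchronized int count() {
            return received.size();
        }
    }

    /** Pairs one writer with one dedicated worker thread (assumed design). */
    static class WriterLane {
        final StubWriter writer = new StubWriter();
        final ExecutorService worker = Executors.newSingleThreadExecutor();

        /** Hands a batch to the worker so the caller does not block on the add. */
        Future<?> submit(List<String> mutations) {
            return worker.submit(() -> writer.addMutations(mutations));
        }

        /** Stops accepting work and waits for queued batches to finish. */
        void shutdown() throws InterruptedException {
            worker.shutdown();
            if (!worker.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("writer thread did not finish");
            }
        }
    }

    /** Feeds two lanes from one producer loop and returns the count each writer received. */
    static int[] ingest(int batches, int batchSize) throws Exception {
        WriterLane lane1 = new WriterLane();
        WriterLane lane2 = new WriterLane();
        List<Future<?>> pending = new ArrayList<>();
        try {
            for (int b = 0; b < batches; b++) {
                List<String> m1 = new ArrayList<>();
                List<String> m2 = new ArrayList<>();
                for (int i = 0; i < batchSize; i++) {
                    m1.add("t1-row-" + b + "-" + i);
                    m2.add("t2-row-" + b + "-" + i);
                }
                // Each lane applies its batches in submission order on its own thread.
                pending.add(lane1.submit(m1));
                pending.add(lane2.submit(m2));
            }
            // get() rethrows any exception raised inside a worker thread.
            for (Future<?> f : pending) {
                f.get();
            }
        } finally {
            lane1.shutdown();
            lane2.shutdown();
        }
        return new int[] { lane1.writer.count(), lane2.writer.count() };
    }

    public static void main(String[] args) throws Exception {
        int[] counts = ingest(100, 50);
        System.out.println(counts[0] + " " + counts[1]);
    }
}
```

A single-thread executor per writer keeps the batches for each table in order while letting the parsing loop move on. A real ingest would probably also need to bound the number of pending batches so the queue cannot grow without limit.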
On Wed, Sep 18, 2013 at 5:07 PM, Slater, David M.
<[EMAIL PROTECTED]> wrote:

> Hi, I’m running a single-threaded ingestion program that takes data from
> an input source, parses it into mutations, and then writes those mutations
> (sequentially) to four different BatchWriters (all on different tables).
> Most of the time (95%) is spent adding mutations, e.g.
> batchWriter.addMutations(mutations); I am wondering how to reduce the time
> taken by these methods.
>
> 1) For the method batchWriter.addMutations(Iterable<Mutation>), does it
> matter for performance whether the mutations returned by the iterator are
> sorted in lexicographic order?
>
> 2) If the Iterable<Mutation> that I pass to the BatchWriter is very large,
> will I need to wait for a number of Batches to be written and flushed
> before it will finish iterating, or does it transfer the elements of the
> Iterable to a different intermediate list?
>
> 3) If that is the case, would it then make sense to spawn off short
> threads for each time I make use of addMutations?
>
> At a high level, my code looks like this:
>
> BatchWriter bw1 = connector.createBatchWriter(…);
> BatchWriter bw2 = …;
> …
> while (true) {
>     String[] data = input.getData();
>     List<Mutation> mutations1 = parseData1(data);
>     List<Mutation> mutations2 = parseData2(data);
>     …
>     bw1.addMutations(mutations1);
>     bw2.addMutations(mutations2);
>     …
> }
>
> Thanks,
> David
>