Accumulo >> mail # user >> Uneven distribute of Hosted Tablets?


Ott, Charles H. 2013-05-30, 15:22
David Medinets 2013-05-30, 20:33
Ott, Charles H. 2013-05-30, 20:40
John Vines 2013-05-30, 21:30
Ott, Charles H. 2013-05-31, 14:00
Billie Rinaldi 2013-05-31, 15:02
Ott, Charles H. 2013-05-31, 15:10
Billie Rinaldi 2013-05-31, 16:14
Ott, Charles H. 2013-05-31, 16:39
Billie Rinaldi 2013-05-31, 16:47
Ott, Charles H. 2013-05-31, 16:53
Billie Rinaldi 2013-05-31, 17:31
Ott, Charles H. 2013-05-31, 17:37
Billie Rinaldi 2013-05-31, 17:57
Ott, Charles H. 2013-05-31, 18:08
Re: Uneven distribute of Hosted Tablets?
You could also lower the split threshold (do a `config -t <table>` and
you'll see a parameter with a similar name) and then compact the table.
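
The parameter in question is most likely `table.split.threshold`. As a rough sketch of the sequence described above, the helper below only builds the Accumulo shell commands as strings; it does not talk to a cluster. The helper name, the table name `mytable`, and the `256M` value are placeholders for illustration, not recommendations.

```python
def split_and_compact_commands(table, threshold="256M"):
    """Build the Accumulo shell commands for lowering a table's split
    threshold and then compacting it. Nothing is executed here; the
    strings are meant to be pasted into (or piped to) the Accumulo shell.
    """
    if not table or any(ch.isspace() for ch in table):
        raise ValueError("table name must be non-empty and contain no whitespace")
    return [
        # Show the current per-table settings, including the split threshold.
        f"config -t {table}",
        # Lower the threshold (the value here is an assumed example).
        f"config -t {table} -s table.split.threshold={threshold}",
        # Compact the table and wait for the compaction to finish.
        f"compact -t {table} -w",
    ]


if __name__ == "__main__":
    for command in split_and_compact_commands("mytable"):
        print(command)
```

Whether a lower threshold actually helps depends on how much data each table holds; with very small tables it may produce no additional splits at all.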

How are you ingesting data? I believe that adding monotonically increasing
keys can lead to a pattern where only the last tablet is being added to and
split (not 100% on this). If you know some distribution for the keys you're
adding, it might be a good idea to add split points to the table to
increase parallelism.
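
To make the point about key distribution concrete, here is a small sketch in plain Python, with no Accumulo client involved. It picks split points at evenly spaced positions in a sample of existing keys, formats the matching `addsplits` shell command, and shows which tablet a key would fall into. The function names, the table name `mytable`, and the zero-padded numeric keys are all assumptions made for the example.

```python
from bisect import bisect_left
from collections import Counter


def choose_split_points(sample_keys, num_tablets):
    """Pick num_tablets - 1 split points at evenly spaced positions
    in the sorted, de-duplicated key sample."""
    keys = sorted(set(sample_keys))
    if num_tablets < 2 or len(keys) < num_tablets:
        return []
    return [keys[(i * len(keys)) // num_tablets] for i in range(1, num_tablets)]


def tablet_index(key, splits):
    """Return the index of the tablet that would hold `key`.
    A split point is treated as the inclusive end row of its tablet."""
    return bisect_left(splits, key)


def addsplits_command(table, splits):
    """Format an Accumulo shell `addsplits` command for the given splits."""
    return f"addsplits -t {table} " + " ".join(splits)


if __name__ == "__main__":
    # Hypothetical existing data: 1000 zero-padded sequential keys.
    sample = [f"{i:08d}" for i in range(1000)]
    splits = choose_split_points(sample, 4)
    print(addsplits_command("mytable", splits))

    # Keys that keep increasing past the sampled range all map to the
    # last tablet, which is the hot-spot pattern described above.
    new_keys = [f"{i:08d}" for i in range(1000, 1100)]
    print(Counter(tablet_index(k, splits) for k in new_keys))
```

Split points chosen this way only spread out writes whose keys fall inside the sampled range; if new keys always sort after everything already ingested, they still pile up in the final tablet.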
On Fri, May 31, 2013 at 10:00 AM, Ott, Charles H. <[EMAIL PROTECTED]> wrote:

> I performed a clean shutdown and startup of all the processes using the
> start-all.sh/stop-all.sh scripts.
>
> The systems have only been online for about 5 minutes and everything is
> working.  But I see the following recent WARN in the logs:
>
> time              application            count  level  message
> 31 09:37:57,0774  tserver:1620-accumulo  1      WARN   Future location is not to this server for the root tablet
>
> Hosted tablet distribution seems to be worse:
>
> (Image Below Here)
>
> (Image Above Here)
>
> I am able to log in and scans seem to be responsive.  I noticed that when
> our entry count was ~20 M, our batch scans were taking much longer.  I
> was hoping that by distributing the tablets evenly, and splitting some of
> the bigger tables, we could get better performance.
>
> As for splitting the bigger table, I received a message from a peer.  He
> mentioned that I could create a new table and split it on the values I
> want, then use a MapReduce job to move the data from the single-tablet
> table to the split table.
>
> *From:* [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] *On Behalf Of* John Vines
> *Sent:* Thursday, May 30, 2013 5:30 PM
> *To:* [EMAIL PROTECTED]
> *Cc:* Lahr-Vivaz, Emilio F.
> *Subject:* Re: Uneven distribute of Hosted Tablets?
>
> Your distribution is cause for concern. I thought we had resolved a lot of
> the balancer issues in 1.4.1 or 1.4.2. Are you seeing any errors from the
> master in your logs? Worst case scenario is you just have to kill the
> master process and start it back up and you should see things balancing out.
>
> On Thu, May 30, 2013 at 4:40 PM, Ott, Charles H. <[EMAIL PROTECTED]>
> wrote:
>
> Thanks for the feedback.  I will keep what you said in mind.
>
> *From:* [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] *On Behalf Of* David Medinets
> *Sent:* Thursday, May 30, 2013 4:34 PM
> *To:* accumulo-user
> *Subject:* Re: Uneven distribute of Hosted Tablets?
>
> Don't worry about splits until you have a few billion entries and a lot
> more servers. What you're seeing now is just a bad signal-to-noise ratio.
>
> On Thu, May 30, 2013 at 11:22 AM, Ott, Charles H. <[EMAIL PROTECTED]>
> wrote:
>
> First I want to say thanks to you all.  The information provided by
> this mailing list has been invaluable to me and I appreciate it.
>
> My newest concern is the uneven allocation of hosted tablets across my
> tablet servers:
>
> (Image Pasted below here)
>
> (Image Pasted above here)
>
> I have been reading about pre-splitting tables in the Accumulo guide.  But
> I am not sure if that would be the ‘fix’ for this.  (Or even if this needs
> fixing.)
>
> I have 3 tables that could potentially grow to *n* number of records.
> Currently all of those tables (and their single tablets) reside on the
> 1620-accumulo server (Hosting 24 tablets).
>
> Since there are already several entries in those tables, would splitting
> them be appropriate?  Does splitting guarantee that the new tablets will be
Ott, Charles H. 2013-05-31, 14:33