Automatic minor compactions are usually fine, but reclaiming disk space through minor compactions alone can require much more free disk space, especially in a time series use case with the size-tiered compaction strategy (STCS). This may apply to leveled compaction as well, but I'm not familiar with that strategy. We run the time series / STCS combination and currently plan to run a major compaction every X weeks. Although not perfect, this is currently our only effective way to actually remove outdated data from disk without paying for the extra storage we would otherwise need. The reason is that it takes a long time until the delete markers (tombstones) written according to our retention policy are actually compacted away by automatic minor compactions when the SSTables involved are potentially large. Mind you, before 2.2 a major compaction results in a single (large) SSTable again, so the whole disk usage trouble starts over. With 2.2+ there is an option (nodetool compact --split-output, or -s) to end up with several SSTables of 50%, 25%, etc. of the total size per column family / table, so this might be useful.
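To illustrate why a large SSTable can sit untouched for a long time under STCS, here is a minimal Python sketch of size-tiered bucketing. It is a simplification, not Cassandra's actual implementation: the function names are made up for this example, and the defaults (bucket_low 0.5, bucket_high 1.5, min_threshold 4, min_sstable_size 50 MB) are assumed to match the usual STCS defaults.

```python
def stcs_buckets(sizes_mb, bucket_low=0.5, bucket_high=1.5, min_sstable_size=50):
    """Group SSTable sizes (in MB) into buckets of similar size (simplified STCS)."""
    buckets = []  # each bucket is a list of SSTable sizes
    for size in sorted(sizes_mb):
        for bucket in buckets:
            avg = sum(bucket) / len(bucket)
            # An SSTable joins a bucket if its size is close to the bucket average,
            # or if both it and the bucket are below the small-SSTable cutoff.
            similar = avg * bucket_low < size < avg * bucket_high
            both_small = size < min_sstable_size and avg < min_sstable_size
            if similar or both_small:
                bucket.append(size)
                break
        else:
            # No existing bucket fits, so this SSTable starts a new one.
            buckets.append([size])
    return buckets


def compaction_candidates(sizes_mb, min_threshold=4):
    """Only buckets holding at least min_threshold SSTables are eligible for compaction."""
    return [b for b in stcs_buckets(sizes_mb) if len(b) >= min_threshold]


if __name__ == "__main__":
    # One ~500 GB SSTable left over from a major compaction plus four ~1 GB SSTables.
    sizes = [500_000, 900, 1000, 1100, 950]
    print(compaction_candidates(sizes))
```

In this example only the four small SSTables form an eligible bucket. The 500 GB SSTable stays alone in its bucket until several SSTables of comparable size exist, so the tombstones and the data they shadow inside it are not purged by minor compactions in the meantime.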
If you have a time series use case, you may want to look at the new time window compaction strategy (TWCS) introduced in 3.0, but it only works well for TTL-based time series data. We tested it and it works great, but unfortunately we can't use it: we may have different TTL/retention policies within a single column family, and even retention configurations that vary per customer over time, so TWCS is not really an option for us.
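The problem with mixed TTLs can be sketched in a few lines of Python. This is a simplified model, not Cassandra code: it assumes one SSTable per time window and that an SSTable can only be dropped as a whole once every row in it has expired. It ignores gc_grace_seconds and overlapping SSTables, and the function name and the one-day window size are assumptions made for the example.

```python
from collections import defaultdict

DAY = 86_400  # seconds


def window_drop_times(rows, window_seconds=DAY):
    """Map each time window to the earliest time its whole SSTable could be dropped.

    rows is an iterable of (write_timestamp, ttl_seconds) pairs.
    A window can only be dropped once its longest-lived row has expired.
    """
    expiry = defaultdict(int)
    for write_ts, ttl in rows:
        window_start = write_ts - write_ts % window_seconds
        expiry[window_start] = max(expiry[window_start], write_ts + ttl)
    return dict(expiry)


if __name__ == "__main__":
    uniform = [(0, 7 * DAY), (3600, 7 * DAY)]   # every row uses a 7-day TTL
    mixed = [(0, 7 * DAY), (3600, 365 * DAY)]   # one customer keeps data for a year
    print(window_drop_times(uniform))
    print(window_drop_times(mixed))
```

With a uniform TTL the window becomes droppable about a week after it was written. With a single long-retention row in the same window, the whole window stays on disk until that row expires, which is why varying retention per customer inside one table works against TWCS.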
From: Akshit Jain [mailto:[EMAIL PROTECTED]]
Sent: Thursday, September 14, 2017 08:50
To: [EMAIL PROTECTED]
Subject: Compaction in cassandra
Is it helpful to run nodetool compact in Cassandra, or is automatic compaction just fine?
Regards