ScyllaDB Enterprise 2022.2 is here! It offers new features, plus over 100 stability and performance fixes for our popular NoSQL database. ScyllaDB Enterprise is available as a standalone self-hosted product, and serves as the engine used within our fully-managed Database-as-a-Service, ScyllaDB Cloud.
New Release Cycle for Enterprise Customers
ScyllaDB Enterprise 2022.1 was introduced last year for standalone self-hosted use, whether on the public cloud or on premises. ScyllaDB Enterprise also serves as the engine at the heart of ScyllaDB Cloud, our fully-managed Database-as-a-Service (DBaaS). As promised, we have now delivered a feature-based release that keeps ScyllaDB Enterprise more closely in sync with the release cadence of our ScyllaDB Open Source project. Thus, ScyllaDB Enterprise 2022.2 mirrors the production-ready features available in ScyllaDB Open Source 5.1.
Alternator (DynamoDB Compatible) TTL
In ScyllaDB Open Source 5.0, we introduced Time To Live (TTL) to the DynamoDB-compatible API (Alternator) as an experimental feature. In ScyllaDB Enterprise 2022.2, we have promoted it to production ready.
Like in DynamoDB, Alternator items that are set to expire at a specific time will not disappear precisely at that time, but only after some delay. DynamoDB guarantees that the expiration delay will be less than 48 hours (though for small tables, the delay is often much shorter). In Alternator, the expiration delay is configurable: it defaults to 24 hours, but can be set with the --alternator-ttl-period-in-seconds configuration option.
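In both DynamoDB and Alternator, expiration is driven by a numeric item attribute holding a Unix timestamp in seconds; you choose the attribute name when enabling TTL on the table. A minimal sketch of computing such an attribute value follows. The item shape and the `expire_at` attribute name are illustrative assumptions, not part of either API:

```python
import time

def expires_at(ttl_seconds, now=None):
    """Compute the value for a DynamoDB-style TTL attribute:
    a Unix epoch timestamp (in seconds) after which the item
    becomes eligible for server-side expiration."""
    if now is None:
        now = time.time()
    return int(now) + ttl_seconds

# Example: an item that becomes eligible for expiry in one hour.
# "expire_at" is a hypothetical attribute name; use whatever name
# you configured when enabling TTL on the table.
item = {
    "id": {"S": "user-42"},                      # partition key
    "expire_at": {"N": str(expires_at(3600))},   # TTL attribute (epoch seconds)
}
```

Remember that, as noted above, the item will not vanish at exactly that timestamp; it becomes a candidate for deletion once the expiration scanner next processes it.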
Rate Limit Per Partition
You can now set per-partition rate limits for reads or writes per second. Consider the following CQL example:
CREATE TABLE tab ( ... )
WITH PER PARTITION RATE LIMIT = {
    'max_writes_per_second': 150,
    'max_reads_per_second': 400
};
You can set different per-partition rates for writes and reads. Queries exceeding these limits will be rejected. This helps the database avoid hot-partition problems and mitigate issues external to the database, such as spam bots. The feature pairs well with ScyllaDB’s shard-aware drivers, since rejected requests cost the least. You can read more about this feature in the ScyllaDB documentation and in the feature design note on GitHub.
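Because over-limit queries are rejected outright rather than queued, clients should be prepared to retry with backoff. Below is a minimal, driver-agnostic sketch; how a rate-limit rejection actually surfaces depends on your driver, so the `is_rate_limited` predicate is a stand-in you would implement against your driver's exception types:

```python
import random
import time

def with_backoff(op, is_rate_limited, max_attempts=5, base_delay=0.05):
    """Run `op`, retrying with jittered exponential backoff whenever
    `is_rate_limited(exc)` reports that the failure was a per-partition
    rate-limit rejection. Any other error is re-raised immediately."""
    for attempt in range(max_attempts):
        try:
            return op()
        except Exception as exc:
            if not is_rate_limited(exc) or attempt == max_attempts - 1:
                raise
            # Exponential backoff with full jitter: 0 .. base * 2^attempt
            time.sleep(random.uniform(0, base_delay * 2 ** attempt))
```

The jitter matters here: if many clients hammer the same hot partition and all retry on a fixed schedule, their retries arrive in synchronized waves and keep tripping the limit.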
Load and Stream
Historically, when restoring from backup you needed to place the restored SSTables on the same number of nodes as in the original cluster. But the cluster topology may have changed radically since the backup was made, with nodes added or removed; if so, the new token distribution will not match the SSTables. With ScyllaDB’s Load and Stream feature, you don’t need to worry about what the cluster topology was when you made your backup. Just place the SSTables on one of the nodes of the current cluster and run nodetool refresh; the system will automatically determine how to reshard and rebalance the partitions across the cluster, streaming data to the new owning nodes.
Performance: Eliminate Exceptions from Read and Write Path
When a coordinator times out, it generates an exception which is then caught in a higher layer and converted to a protocol message. Since exceptions are slow, a node that experiences timeouts can become even slower. To prevent that, the coordinator write and read paths have been converted not to use exceptions for timeout cases, treating them as just another kind of result value instead. Further work on the read path and on the replica reduces the cost of timeouts, so that goodput is preserved while a node is overloaded. You can read more about this exception-elimination work in the release notes.
Prune Materialized Views
Another new feature is a CQL extension, PRUNE MATERIALIZED VIEW. This statement can be used to remove inconsistent rows, known as “ghost rows,” from materialized views.
ScyllaDB’s materialized views have been production-ready since ScyllaDB Open Source 3.0, and we continuously strive to make them even more robust and problem-free. A ghost row is a row in a materialized view that does not correspond to any row in the base table. Such inconsistencies should be prevented altogether, and ScyllaDB strives to avoid them. Yet if they do occur, this statement can restore a materialized view to a fully consistent state without rebuilding it from scratch.
Example usages:
PRUNE MATERIALIZED VIEW my_view;
PRUNE MATERIALIZED VIEW my_view WHERE v = 19;
PRUNE MATERIALIZED VIEW my_view WHERE token(v) > 7 AND token(v) < 1535250;
Distributed SELECT COUNT (*)
We know full well that counting all the rows in a table has historically been slow. It can even result in a ReadTimeout error, because it requires a full-scan query across all nodes. And while we’ve implemented USING TIMEOUT to alter the normal timeout of operations, wouldn’t it be better to just make the query run faster?
Well, now it is faster, because in ScyllaDB Enterprise 2022.2 we’ve added a distributed SELECT COUNT feature. In the original implementation, the coordinator node sends a request and gets a count back from one node, then asks the next node, and so on around the cluster ring sequentially. Each node must return its count to the coordinator before the coordinator moves on to ask the next node. For a large cluster with many nodes, this artificially bottlenecks the process.
Starting with this release, we’ve created a new node role for such queries, known as the “super coordinator.” The query is divided into separate workloads, each sent from the super coordinator to other selected coordinator nodes around the cluster. Each coordinator handles part of the overall query, polling the individual nodes where the data is stored, then marshaling the results and returning them to the super coordinator. The super coordinator gathers the responses from the coordinators and sends the collective SELECT COUNT response back to the client.
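The fan-out-and-sum pattern described above can be sketched in miniature. Everything here is illustrative: the token ranges and the `count_range` callback stand in for the real per-coordinator sub-queries, not for ScyllaDB internals:

```python
from concurrent.futures import ThreadPoolExecutor

def distributed_count(token_ranges, count_range):
    """Toy version of the super-coordinator pattern: fan a COUNT
    sub-query out over each token range in parallel, then sum the
    partial counts into a single result for the client."""
    with ThreadPoolExecutor(max_workers=len(token_ranges)) as pool:
        partial_counts = pool.map(count_range, token_ranges)
        return sum(partial_counts)

# Illustrative stand-in for "ask the coordinator owning this range":
rows_per_range = {0: 120, 1: 95, 2: 210, 3: 75}   # fake token ranges
total = distributed_count(list(rows_per_range), lambda r: rows_per_range[r])
```

The total latency of the whole count is then bounded by the slowest sub-count rather than by the sum of all of them, which is the source of the speedup described below.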
The advantage of this approach is that SELECT COUNT results can complete much faster by running in parallel across all the nodes in the cluster. By parallelizing the query, we took a SELECT COUNT (*) operation against a 96 vCPU cluster that used to return results in 86 seconds down to a 2-second response time.
The corresponding disadvantage is that it will impact all the vCPUs in the cluster at once, so we suggest being judicious in running such expensive queries.
This feature is a natural complement to Workload Prioritization, which can put limits on how intensive such queries’ impact will be on your production clusters.
Getting ScyllaDB Enterprise 2022.2
If you are an existing ScyllaDB Enterprise customer, we encourage you to upgrade your clusters in coordination with our support team.
Note that since ScyllaDB Cloud is a fully-managed solution, all clusters will be automatically upgraded by our support team in due time.
Finally, if you are interested in ScyllaDB Enterprise, you can try our 30-day trial offer. However, trials and proofs of concept (POCs) for enterprise software have the highest success rate when conducted in coordination with our team of experts. Before you start the trial timer ticking, contact us directly so we can help you achieve your goals.