Unlocking Performance with Java 17 and Shenandoah in Cassandra 5.x

December 10, 2025

Introduction

Apache Cassandra 5.x brings substantial improvements to the database engine, but to get the most out of it you need to pair it with the right Java runtime. Cassandra 5.x officially supports Java 11 and Java 17, and running it on Java 17 with the Shenandoah garbage collector can bring a significant improvement in performance, stability, and operational efficiency.

The combination of Java 17's modern language features and runtime optimisations with Shenandoah's low-latency garbage collection creates an environment where Cassandra can achieve higher throughput with more predictable latency profiles. This guide explores why this combination matters and how to implement it in your production environment.

Why Java 17?


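Java 17 is a Long-Term Support (LTS) release, meaning it receives security updates and bug fixes for an extended period. Oracle's Premier Support for Java 17 runs until September 2026, and other vendors publish their own support timelines. Java 17 was released in September 2021, so compared with Java 11 (released in 2018) it includes three further years of language and runtime improvements.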

Key Java 17 Improvements for Cassandra

Record Classes (final since Java 16)

Records provide a clean, concise way to define immutable data carriers such as configuration objects and data transfer objects, reducing boilerplate and improving code clarity. Because Cassandra 5.x still has to build and run on Java 11, its own source cannot rely on records yet; the benefit today is mainly for tooling and applications built on Java 17 around Cassandra.

Pattern Matching (ongoing improvement)

Pattern matching for instanceof has been final since Java 16, while pattern matching for switch is still a preview feature in Java 17. These features make message parsing and deserialization code more elegant, though the main benefits are felt by developers maintaining a codebase rather than operators running it.

Sealed Classes

Sealed classes, finalised in Java 17, restrict which classes can extend a given class, enabling more precise control over inheritance hierarchies and preventing unexpected subclassing.

Text Blocks

Text blocks (final since Java 15) allow multi-line string literals, making embedded configuration snippets and error messages easier to read in the source code.

Performance Improvements

Under the hood, Java 17 includes years of optimisations:

Improved escape analysis, letting the JIT avoid more short-lived heap allocations
Enhanced JIT compilation with better inlining decisions
Better support for NUMA systems (important for large multi-socket Cassandra hosts)
Improved vectorisation opportunities

Understanding Shenandoah

Shenandoah is a low-latency garbage collector developed by Red Hat. It entered the JDK as an experimental feature in Java 12 and became a production feature in Java 15. Unlike G1GC, the default collector, Shenandoah performs most of its work, including compaction, concurrently with the application, dramatically reducing pause times.

The Problem Shenandoah Solves

Cassandra is a latency-sensitive application. When the garbage collector pauses your JVM to clean up memory, all queries hang—requests that should take a few milliseconds might suddenly take several hundred milliseconds or seconds. In a distributed system, these tail latencies compound across the cluster.

With G1GC (the default), pause times can range from 50 to 500ms depending on your heap size and data characteristics. For Cassandra, that's unacceptable when your SLA targets 99th percentile latencies of 10-50ms.

Shenandoah reduces pause times to typically 5-10ms regardless of heap size, because it does almost all its work concurrently while your Cassandra threads continue servicing requests.

How Shenandoah Works

Shenandoah uses a concurrent marking algorithm combined with a forwarding pointer mechanism, which lets application threads find an object's new location while the collector is moving it. Here's the simplified version:

Concurrent Marking: While Cassandra processes requests, Shenandoah walks the object graph in the background, marking which objects are still alive.
Concurrent Compaction: Shenandoah identifies regions of the heap that contain mostly garbage and evacuates the live objects out of those regions while the application runs, then updates references to them concurrently as well.
Brief Pause Points: Shenandoah only briefly pauses the application (typically 1-5ms each) at a few phase boundaries, such as the start and end of marking, to scan roots and keep memory consistent. These pauses are far shorter than G1GC's stop-the-world phases.

This approach keeps your Cassandra nodes responsive even during garbage collection, translating to better p99 latencies and more consistent performance.
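As a rough illustration of the split between pauses and concurrent work, the sketch below totals both from a GC log. It assumes unified logging output in which each phase line ends with a duration such as 0.298ms and contains either "Pause" or "Concurrent"; the function name gc_time_split is made up for this example, and the exact log wording can differ between JDK builds.

```bash
# Sum stop-the-world and concurrent phase durations from a Shenandoah GC log.
# Assumption: phase lines end in a duration token such as "0.298ms".
gc_time_split() {
  awk '
    $NF ~ /^[0-9.]+ms$/ {
      ms = $NF; sub(/ms$/, "", ms)
      if ($0 ~ / Pause /)           pause += ms
      else if ($0 ~ / Concurrent /) conc  += ms
    }
    END { printf "pause_ms=%.3f concurrent_ms=%.3f\n", pause, conc }
  ' "$1"
}
```

Called as gc_time_split gc.log, it prints one line with both totals. On a healthy node the concurrent total would be expected to dwarf the pause total.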

JDK Distributions with Shenandoah Support

Not all JDK distributions include Shenandoah. Notably, Oracle JDK does not ship with Shenandoah support—Oracle explicitly excludes it from their builds. To use Shenandoah with Cassandra, you must use a JDK distribution that includes it.

The following distributions include Shenandoah support:

Azul Zulu – Commercial support available
Red Hat OpenJDK – Available on RHEL, CentOS, Fedora, and derivatives
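
One way to check a particular JDK, sketched below, is to ask the binary itself: a build without Shenandoah rejects the -XX:+UseShenandoahGC flag and exits with an error. The helper name supports_shenandoah is hypothetical, and the path in the usage note is only a placeholder.

```bash
# Return success if the given java binary accepts -XX:+UseShenandoahGC.
# Falls back to whatever "java" is on PATH when no argument is given.
supports_shenandoah() {
  "${1:-java}" -XX:+UseShenandoahGC -version >/dev/null 2>&1
}
```

For example, supports_shenandoah /path/to/jdk-17/bin/java && echo "Shenandoah available" prints the message only when the flag is accepted.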

Cassandra 5.x + Java 17 + Shenandoah: The Complete Picture

Why This Combination Works So Well

Cassandra 5.x Benefits

Cassandra 5.x was designed with modern JVM features in mind. It takes advantage of Java 17's performance improvements and is tested with current garbage collectors. Java 17 is one of the two Java versions that Cassandra 5.x officially supports.

Java 17 Stability

As an LTS release with mature optimisations, Java 17 provides a stable foundation. You're not running on the bleeding edge of Java development; you're running on a proven, supported platform.

Shenandoah Maturity

Shenandoah has been production-tested at scale for years. Many organisations running demanding Java applications, including large Cassandra clusters at enterprises, rely on Shenandoah for their latency requirements.

Concrete Performance Benefits

The specific improvements you can expect depend on your workload, but we see consistent patterns:

Garbage Collection Pause Time

G1GC: 50-500ms average, occasional 1-2 second outliers
Shenandoah: 5-10ms consistently
Improvement: 90-99% reduction in pause times
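
The percentages follow directly from the pause figures above. As a quick check, this sketch computes the reduction for a few pairings of the quoted endpoints; pct_reduction is a name invented for the example.

```bash
# Percentage reduction from a "before" value to an "after" value.
pct_reduction() {
  awk -v before="$1" -v after="$2" \
    'BEGIN { printf "%.0f\n", (before - after) / before * 100 }'
}

pct_reduction 50 5     # best G1GC case vs best Shenandoah case   -> 90
pct_reduction 500 5    # worst G1GC case vs best Shenandoah case  -> 99
pct_reduction 50 10    # best G1GC case vs worst Shenandoah case  -> 80
```

The last pairing shows that a node already at the low end of the G1GC range gains less than one suffering long pauses.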

P99 Latency

For a 100-node cluster running mixed read/write workloads, organizations report:

G1GC: 45-80ms p99 latency
Shenandoah: 12-25ms p99 latency
Improvement: 40-70% reduction

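Throughput

Shenandoah spends somewhat more CPU on concurrent collection than G1GC does, but request threads lose far less time to stop-the-world pauses. For the workloads reported here, the net effect was positive: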

10-15% throughput improvement for write-heavy workloads
5-10% improvement for read-heavy workloads

Predictability

Perhaps the most important metric is the standard deviation of latencies. Shenandoah dramatically reduces tail latency variance, meaning fewer surprise slow requests.

Getting Started: Partner with Digitalis

Implementing Java 17 and Shenandoah for optimal Cassandra 5.x performance requires careful planning, testing, and execution. This is not a simple configuration change—it involves assessing your infrastructure, tuning JVM parameters for your specific workload, orchestrating a rolling deployment, and monitoring results to ensure you achieve the expected benefits.

Contact Digitalis to help you implement this upgrade successfully.

Digitalis brings hands-on expertise in deploying Java 17 and Shenandoah across production Cassandra clusters. Our team will:

Assess your current infrastructure and determine the best approach for your topology and workload
Configure and optimise Java 17 and Shenandoah settings tailored to your hardware and data patterns
Execute a safe rollout with comprehensive monitoring and validation at each step
Monitor and tune post-deployment to ensure you're achieving the expected latency and throughput improvements
Provide ongoing support for long-term operation and optimisation

The benefits of Java 17 and Shenandoah are substantial, but successful implementation depends on understanding your specific environment and executing a well-planned migration strategy. Let Digitalis handle the complexity so you can focus on your business.

Essential Java Configuration Options

If you choose to implement this yourself, here are the critical Java 17 configuration options required for running Cassandra 5.x with Shenandoah:

Module Access Options for Java 17

Java 17 enforces strong encapsulation of JDK internals by default (the module system itself arrived with Project Jigsaw in Java 9). Cassandra's internal operations rely on reflective access to packages that are no longer open by default, so these options must be present in your JVM configuration:

bash

--add-opens java.base/java.io=ALL-UNNAMED
--add-opens java.base/sun.nio.ch=ALL-UNNAMED

--add-opens java.base/java.io=ALL-UNNAMED

Cassandra performs low-level I/O operations that require reflective access to the java.io package internals. This flag opens the package to all code on the classpath (the unnamed module), which includes Cassandra and its third-party libraries. Without it, you may see InaccessibleObjectException or IllegalAccessException errors when Cassandra attempts file operations.

--add-opens java.base/sun.nio.ch=ALL-UNNAMED

Cassandra uses NIO (Java's "New I/O" API, which provides non-blocking I/O) for network communication, and this requires access to sun.nio.ch internal classes. This is critical for the native transport protocol. Without this flag, network operations may fail with module access errors.
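
To confirm that both options are actually present before a restart, a small check along these lines may help. It is only a sketch: missing_add_opens is a hypothetical helper, and the options file path in the usage note is an assumption about your installation layout.

```bash
# Print each required --add-opens target that is absent from a JVM options file.
# Commented-out lines are ignored. Matching on the package part means both
# "--add-opens X" and "--add-opens=X" spellings are accepted.
missing_add_opens() {
  for target in java.base/java.io=ALL-UNNAMED java.base/sun.nio.ch=ALL-UNNAMED; do
    grep -v '^[[:space:]]*#' "$1" | grep -qF -- "$target" || echo "$target"
  done
}
```

Running missing_add_opens conf/jvm17-server.options prints nothing when both targets are configured, and otherwise lists the ones still to add.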

Experimental VM Options

bash

-XX:+UnlockExperimentalVMOptions

The -XX:+UnlockExperimentalVMOptions flag enables experimental JVM features needed for advanced Shenandoah tuning. This is required for certain Shenandoah-specific tuning parameters like ShenandoahUnloadClassesFrequency, which allows fine-tuning of class unloading during concurrent GC phases.

Important: While the flag name mentions "experimental," Shenandoah itself is stable and production-ready. The experimental flag merely grants access to tuning knobs that may change between JVM versions. For Cassandra 5.x with Java 17, these options are well-tested and safe for production use.
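
Putting the pieces together, the fragment below sketches how the collector might be switched on in a cassandra-env.sh style script, which builds up the JVM_OPTS variable. The log path is a placeholder, and the logging options are one reasonable choice rather than a recommendation from the Cassandra project. Any existing collector selection, such as -XX:+UseG1GC, would need to be removed or commented out, because the JVM refuses to start when more than one collector is selected.

```bash
# Sketch of a cassandra-env.sh style fragment. GC_LOG is a placeholder path.
GC_LOG="${GC_LOG:-/var/log/cassandra/gc.log}"

JVM_OPTS="${JVM_OPTS:-}"
# Select Shenandoah as the collector.
JVM_OPTS="$JVM_OPTS -XX:+UseShenandoahGC"
# Must appear before any experimental Shenandoah tuning flag.
JVM_OPTS="$JVM_OPTS -XX:+UnlockExperimentalVMOptions"
# Unified GC logging to a rotating file, so pause lines can be inspected later.
JVM_OPTS="$JVM_OPTS -Xlog:gc:file=${GC_LOG}:time,uptime:filecount=10,filesize=10M"
```

Experimental tuning flags, if you use any, would be appended after the unlock flag in the same way.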

Step-by-Step Rollout

Test in Staging

Deploy to a single-node cluster with your production data volume and traffic pattern. Run benchmarks for at least 24 hours to capture various time-of-day patterns.

Monitor Initial Metrics

Compare GC logs, latency distributions, and CPU usage between G1GC and Shenandoah configurations. Use tools like:

nodetool gcstats to inspect pause times
Cassandra metrics exported to Prometheus/Grafana
Application-level latency metrics

Canary Deployment

Upgrade 1-2 nodes in your production cluster. Gradually route traffic to them. Monitor for any anomalies over 48 hours.

Rolling Upgrade

Once confident, perform a rolling upgrade across your cluster, one node at a time. Never upgrade multiple nodes simultaneously in production.
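
The one-node-at-a-time rule can be captured in a small driver loop. The sketch below is deliberately generic: upgrade_node is a hypothetical placeholder that you would replace with your own procedure (typically nodetool drain, a restart on the new JVM settings, and a wait until the node reports as up), and the host names are examples.

```bash
# Hypothetical per-node step. Replace the body with your own procedure;
# it must return non-zero if the node does not come back healthy.
upgrade_node() {
  echo "would drain, restart on Java 17 + Shenandoah, and verify: $1"
}

# Upgrade hosts strictly one at a time and stop at the first failure,
# so a bad rollout never affects more than one node.
rolling_upgrade() {
  for host in "$@"; do
    upgrade_node "$host" || { echo "stopping rollout: $host failed" >&2; return 1; }
  done
}
```

A call such as rolling_upgrade cass-01 cass-02 cass-03 then walks the list in order. Stopping on the first failure keeps the rest of the cluster on the known-good configuration while you investigate.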

Monitor Production

Watch for:

GC pause duration (should be <15ms consistently)
Full GC events (should be rare or absent)
CPU utilisation (may be slightly higher due to concurrent marking, but offset by reduced pause overhead)
P99/P99.9 latencies (should improve)
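
The pause-duration check in the list above can be automated against the GC log. This sketch counts pauses above a threshold, defaulting to the 15ms figure used here. It assumes unified logging lines that contain "Pause" and end in a duration such as 0.402ms; count_slow_pauses is an illustrative name.

```bash
# Count stop-the-world pauses longer than a threshold in milliseconds.
# Usage: count_slow_pauses <gc-log> [threshold-ms]   (threshold defaults to 15)
count_slow_pauses() {
  awk -v limit="${2:-15}" '
    / Pause / && $NF ~ /^[0-9.]+ms$/ {
      ms = $NF; sub(/ms$/, "", ms)
      if (ms + 0 > limit + 0) slow++
    }
    END { print slow + 0 }
  ' "$1"
}
```

A result of 0 over a representative period suggests the node is meeting the pause target; a steadily growing count is a cue to look at the troubleshooting steps.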

Troubleshooting Common Issues

High CPU Usage

Shenandoah uses additional CPU for concurrent marking and evacuation. If CPU exceeds 80% during GC, consider:

Reducing -XX:ConcGCThreads
Increasing -XX:ShenandoahRegionSize (an experimental option, so it needs -XX:+UnlockExperimentalVMOptions)
Switching to static heuristics (-XX:ShenandoahGCHeuristics=static)

Full GC Pauses

If you see full GC events in logs, it means Shenandoah couldn't keep up with the allocation rate:

Increase heap size (if possible)
Give the collector more CPU, for example by raising -XX:ConcGCThreads, so concurrent cycles finish sooner
Check for memory leaks in your application
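
Before a full GC, Shenandoah normally falls back to a "degenerated" stop-the-world cycle, so counting both kinds of event gives earlier warning. The sketch below assumes the log marks them with the phrases "Pause Degenerated GC" and "Pause Full"; check a sample of your own log to confirm the wording for your JDK build. The function name is made up for the example.

```bash
# Report how often the collector fell back to degenerated or full cycles.
# Assumption: the log labels them "Pause Degenerated GC" and "Pause Full".
count_fallback_gcs() {
  awk '
    /Pause Degenerated GC/ { d++ }
    /Pause Full/           { f++ }
    END { printf "degenerated=%d full=%d\n", d, f }
  ' "$1"
}
```

Occasional degenerated cycles are usually tolerable; repeated full cycles point to the heap or CPU headroom issues listed above.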

Latency Spikes with Shenandoah

Shenandoah generally delivers lower latency, but if you still see spikes:

Verify Java version (should be 17.0.7 or later)
Check Shenandoah logs for errors: grep "Shenandoah" gc.log
Consider switching back to G1GC temporarily to isolate the problem

Monitoring and Observability

Key Metrics to Track

GC Pause Time

bash

# Max GC elapsed time (ms) since the last call: second column of the data row
nodetool gcstats | awk 'NR==2 {print $2}'

Memory Allocation Rate

Monitor via JMX or Prometheus:

java.lang:type=Memory MBean
Look for allocation rate trends

Concurrent GC Activity

Check GC logs for concurrent phase activity:

bash

grep -ci "concurrent mark" gc.log

A healthy system shows continuous concurrent work without urgent full GC triggers.

Dashboard Recommendations

If using Prometheus and Grafana, track:

GC pause time (gauge)
GC pause count (counter)
Heap usage (gauge)
Concurrent GC CPU consumption
Application latency percentiles correlated with GC phases

Comparison: G1GC vs Shenandoah

Real-World Impact

Organisations running Cassandra with Java 17 + Shenandoah report:

Financial Services Firm (100-node cluster)

Reduced p99 latency from 65ms to 18ms
Eliminated GC-induced cluster coordination delays
Improved consistency during peak trading hours

E-Commerce Platform (50-node cluster)

Reduced transaction timeouts by 75%
More predictable response times for search queries
Lower CloudWatch alarm noise

Streaming Application (200-node cluster)

Maintained throughput while reducing pause times
Better real-time metrics ingestion
Improved reliability during backlog catch-up scenarios

When NOT to Use Shenandoah

While Shenandoah is excellent for most Cassandra workloads, consider G1GC if:

Your heap is smaller than 4GB (overhead not justified)
You're running on single-core systems (rare for Cassandra)
Your workload is batch-oriented with very predictable GC events
You need the absolute maximum throughput over latency

Conclusion

The combination of Cassandra 5.x, Java 17, and Shenandoah represents a modern, optimized platform for high-performance distributed databases. The latency improvements are significant and measurable, the stability is proven, and the operational benefits compound over time.

If your Cassandra deployment has latency-sensitive workloads or SLAs targeting p99 latencies under 50ms, migrating to this stack should be a priority. The upgrade is straightforward, the benefits are substantial, and the risk is low when approached methodically.

Start with staging, move to canary, then roll out across your cluster. Your users will notice the difference.

