The latest version of Apache Cassandra, an open source, high-performance, distributed big data database management platform, was released this week by the Apache Cassandra Project. Version 4.0 comes with greatly improved performance and management.
“A long time coming, Cassandra 4.0 is the most thoroughly tested Cassandra yet,” said Nate McCall, Vice President of Apache Cassandra and Software Engineer at Apple. “The latest version is faster, more scalable, and bolstered with enterprise security features, ready-for-production with unprecedented scale in the cloud.”
Cassandra 4.0 readily manages unstructured data at millions of writes per second. Version 4.0 is the culmination of three years of work and includes over 1,000 bug fixes, enhancements, and new features, including:
- Increased speed and scalability – Data is streamed up to 5 times faster during scaling operations, and read and write throughput is up to 25% faster, resulting in a more elastic architecture, especially in cloud and Kubernetes deployments.
- Improved consistency – Data replicas are kept in sync for faster, more efficient operation.
- Enhanced security and observability – Audit logging records users’ access and activities with little impact on workload performance. The new capture and replay feature lets users analyze production workloads to help ensure regulatory and security compliance with SOX, PCI, GDPR, and other regulations.
- New configuration settings – Operators can use exposed system metrics and configuration options to get quick access to data that helps them improve deployments.
- Minimized latency – Garbage collector pause times are reduced to a few milliseconds, with minimal latency impact even as heap sizes grow.
- Better compression – Improved compression efficiency reduces the amount of data that must be stored on disk and increases read performance.
Cassandra 4.0 has been hardened and tested by the community, including Apple, Amazon, Netflix, DataStax, Instaclustr, iland, and others, on clusters as large as 1,000 nodes and across hundreds of real-world use cases and schemas.
During the testing and QA phase, the community built repeatable workloads that are as close to real-world conditions as feasible, and checked the cluster state against a model without interrupting the workload.
Apple, Huawei, Netflix
Apache Cassandra is a NoSQL database that can manage huge quantities of data in load-intensive applications while maintaining high availability and avoiding single points of failure. Cassandra’s largest production deployments include Apple (more than 160,000 instances storing over 100 petabytes of data across 1,000+ clusters), Huawei (more than 30,000 instances across 300+ clusters), and Netflix (more than 10,000 instances storing 6 petabytes across 100+ clusters, with over 1 trillion requests per day), among many others.
Cassandra began as a Facebook project in 2008, moved to the Apache Incubator in January 2009, and was promoted to an Apache Top-Level Project in February 2010.
Apache Cassandra is in use at Activision, Apple, Backblaze, BazaarVoice, Best Buy, Bloomberg Engineering, CERN, Constant Contact, Comcast, DoorDash, eBay, Fidelity, GitHub, Hulu, ING, Instagram, Intuit, Macy’s, Macquarie Bank, Microsoft, McDonalds, Netflix, New York Times, Monzo, Outbrain, Pearson Education, Sky, Spotify, Target, Uber, Walmart, Yelp, and thousands of other companies that have large, active data sets.
“Netflix uses Apache Cassandra heavily to satisfy its ever-growing persistence needs on its mission to entertain the world. We have been experimenting and partially using the 4.0 beta in our environments and its features like Audit Logging and backpressure,” said Vinay Chella, Netflix Engineering Manager and Apache Cassandra Committer. “Apache Cassandra 4.0’s improved performance helps us reduce infrastructure costs. 4.0’s stability and correctness allow us to focus on building higher-level abstractions on top of data store compositions, which results in increased developer velocity and optimized data store access patterns. Apache Cassandra 4.0 is faster, secure, and enterprise-ready; I highly suggest giving it a try in your environments today.”