Cloudant Merges its Database Code Back into Open Source Apache CouchDB

Bit Ninja

Cloudant, provider of a globally distributed database-as-a-service (DBaaS), has delivered on its promise to merge core capabilities of its distributed database service back into the open source Apache CouchDB project.

CouchDB serves as the foundation of Cloudant’s technology stack in the form of BigCouch, an open source variant of CouchDB that the company developed to support large-scale, globally distributed applications. After four years of operating BigCouch in production, Cloudant has merged the BigCouch code back into the CouchDB codebase, making it possible to manage and replicate data with CouchDB at much larger scale.

For the merge, Cloudant engineers imported sections of BigCouch code into the Apache CouchDB repositories, adapting the database to run in a clustered environment and to better replicate databases across clusters and between data centers. Going forward, Cloudant will cease separate development of BigCouch and instead work within the CouchDB community, keeping CouchDB and Cloudant clustering functionality in sync. Cloudant engineers will continue to make cluster-scaling and fault-tolerance enhancements within the CouchDB project and will reuse that code in Cloudant’s database service.

BigCouch Clustering Capability

“We’re working to integrate BigCouch’s clustering technology with CouchDB,” said Jan Lehnardt, Vice President of Apache CouchDB. “We’ve set the stage and welcome more project committers to get involved. With Cloudant’s work to fine-tune BigCouch large-scale database replication, Apache CouchDB now has a complete strategy for replicating data across distributed systems, whether nodes are Erlang clusters in the same data center or on the other side of the world. Developers now have more options for moving data closer to their users and a simpler strategy for synchronizing that data throughout a larger system.”

The key accomplishment of the merged code is the BigCouch clustering capability. Among other improvements to Apache CouchDB, Cloudant has contributed a new compactor process that creates smaller and better-organized post-compaction databases. CouchDB users can now expect significant improvements in compaction and replication speed, as well as better performance under high-concurrency access. Additional improvements include faster index updates, updated aggregate reduce functions, smooth hot-code updates, improved logging, and streamlined libraries. Cloudant engineers also refactored internal code, removing complicated sections and boosting overall performance.

A preview of the merged software is available now. A general release of CouchDB with the merged BigCouch functionality is planned once it has passed through the Apache community release process.

“There are a lot of reasons people love CouchDB, like its elegant programming model, data durability, flexible indexing, and, most of all, its unique way of replicating and synching data across data centers or devices,” said Adam Kocoloski, co-founder and CTO at Cloudant. “We’re merging the horizontal scaling and fault-tolerance framework we built for BigCouch into CouchDB so people can more easily scale all that CouchDB goodness across multiple servers and keep it running nonstop. It’s our way of saying thanks and helping to grow the community of CouchDB developers and users.”