Cloudera, a provider of a platform for machine learning and advanced data analytics built on the latest open source technologies, has announced the release of Cloudera Altus – a Platform-as-a-Service (PaaS) offering that would make it easier to run large-scale data processing applications on the public cloud.
The Cloudera Altus Data Engineering service would simplify the development and operations of elastic data pipelines, putting data engineering jobs front and center and abstracting away infrastructure management and operations that can be both time-consuming and complex. Altus would also reduce the risk associated with cloud migrations. It provides users with “familiar” tools packaged in an “open, unified, enterprise-grade” platform service that delivers common storage, metadata, security, and management across multiple data engineering applications.
Cloudera would make it easy and cost-effective to deploy these workloads on cloud providers – such as Amazon Web Services (AWS) – taking advantage of “cloud elasticity, low-cost storage and compute options, and rapid provisioning” to deliver a modern data service that can tackle even the most challenging business problems.
“Data and analytics, particularly in the cloud, continues to be one of the most significant areas of growth and investment for many enterprises,” said James Curtis, senior analyst, data platforms and analytics, 451 Research. “But organizations also face challenges with cloud-based cluster management, data processing, and migration, which is right where Cloudera is focusing its efforts with Altus.”
The initial rollout of Cloudera Altus includes support for Apache Spark, Apache Hive on MapReduce2, and Hive on Spark. It is available today in most AWS regions. Over time, Cloudera plans to expand Altus to support other leading public clouds, such as Microsoft Azure.
According to IDC, public cloud deployments now account for 12% of the overall worldwide business analytics software market and are expected to grow at a 25% CAGR through 2020. Cloud is one of the fastest-growing deployment environments for Cloudera customers, and Altus would make it easier than ever to run data engineering workloads in the cloud.
“Data engineering workloads are foundational for today’s data-driven applications,” said Charles Zedlewski, senior vice president of Products, Cloudera. “Altus simplifies the process of building and running elastic data pipelines while preserving portability and making it easy to incorporate data engineering elements into more complex BI, data science and real-time applications.”
Features and benefits of Altus would include:
- Managed service for elastic data pipelines – Cloudera Altus is a PaaS that allows data engineers to “easily and quickly” provision Apache Spark, Apache Hive, Hive on Spark, and MapReduce2 capacity on cloud-native infrastructure. Altus presents intelligent default cluster settings and environments that would significantly reduce cluster deployment times and operational effort, automating processes like cluster provisioning, configuration, and termination.
- Workload orientation – Cloudera Altus centers around data pipelines rather than clusters or infrastructures, so users can “easily submit, clone, and troubleshoot pipelines with minimal attention paid to the underlying infrastructure.”
- No data silos – The Altus Data Engineering service enables data engineers to read from and write to cloud object storage directly, as does the rest of Cloudera’s platform. This data is immediately available for use by other Cloudera workloads without requiring data replication, ETL, or changes to file formats.
- Backward compatibility and platform portability – Altus supports multiple versions of CDH – one of the most widely used open source platforms in the industry. Users can “easily” move workloads to and from the cloud without needing to modify their applications. Because CDH is backward compatible across minor releases, customers can harness the latest innovation from the Apache big data open source community “without fear of breaking their applications from release to release.”
- Built-in workload management – Altus would automate and simplify the common operational issues related to elastic data pipelines with workload management. Users can troubleshoot failed jobs with or without the clusters or compute infrastructure being present. In addition, Altus’ workload management would flag significant performance deviations and propose a root cause analysis.
“Altus gives us the ability to quickly and easily provision and deploy data engineering clusters on AWS, and enables our ETL developers to run their business-critical workloads without the hassle of ongoing cluster operations and management,” said Takahiro Moteki, Big Data Architect of F.O.X at CyberZ, a Cloudera customer. “We are also pleased to see that we can use the same enterprise technology stack in the cloud as is deployed on-premises to make our cloud migration that much easier.”