The Apache Software Foundation (ASF), the all-volunteer developers, stewards, and incubators of more than 350 open source projects and initiatives, has announced the release of Apache Hadoop v3.0.0 – the latest version of the open-source software framework for scalable, distributed computing.

“This latest release unlocks several years of development from the Apache community,” said Chris Douglas, Vice President of Apache Hadoop. “The platform continues to evolve with hardware trends and to accommodate new workloads beyond batch analytics, particularly real-time queries and long-running services. At the same time, our Open Source contributors have adapted Apache Hadoop to a wide range of deployment environments, including the cloud.”

Apache Hadoop 3.0.0 highlights include:

  • HDFS erasure coding – halves the storage cost of HDFS while also improving data durability;
  • YARN Timeline Service v.2 (preview) – improves the scalability, reliability, and usability of the Timeline Service;
  • YARN resource types – enable scheduling of additional resources, such as disks and GPUs, for better integration with machine learning and container workloads;
  • Federation of YARN and HDFS sub-clusters – transparently scales Hadoop to tens of thousands of machines;
  • Opportunistic container execution – improves resource utilization and increases task throughput for short-lived containers. In addition to its traditional, central scheduler, YARN also supports distributed scheduling of opportunistic containers; and
  • Improved capabilities and performance for cloud storage systems, such as Amazon S3 (S3Guard), Microsoft Azure Data Lake, and Aliyun Object Storage System.
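
The storage claim for HDFS erasure coding in the list above can be checked with a little arithmetic. The sketch below is illustrative only: it assumes the comparison is between HDFS's default 3x replication and a Reed-Solomon layout with 6 data and 3 parity units, and the 600 GB dataset size and the `raw_storage` helper are hypothetical names chosen for this example, not part of Hadoop.

```python
def raw_storage(logical_gb, data_units, redundant_units):
    """Raw capacity consumed when every `data_units` units of data
    are stored alongside `redundant_units` extra units."""
    return logical_gb * (data_units + redundant_units) / data_units


logical_gb = 600  # hypothetical dataset size, in GB

# Assumption: 3x replication, i.e. 1 original block plus 2 extra copies.
replicated = raw_storage(logical_gb, data_units=1, redundant_units=2)

# Assumption: Reed-Solomon with 6 data units and 3 parity units.
erasure_coded = raw_storage(logical_gb, data_units=6, redundant_units=3)

ratio = erasure_coded / replicated
print(replicated, erasure_coded, ratio)  # prints: 1800.0 900.0 0.5
```

Under these assumptions the erasure-coded layout uses half the raw capacity of replication. It also survives the loss of any three units in a stripe, whereas 3x replication survives the loss of two copies, which is consistent with the durability claim.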

“Hadoop 3 is a major milestone for the project, and our biggest release ever,” said Andrew Wang, Apache Hadoop 3 release manager. “It represents the combined efforts of hundreds of contributors over the five years since Hadoop 2. I’m looking forward to how our users will benefit from new features in the release that improve the efficiency, scalability, and reliability of the platform.”

Hadoop 3.0.0 has already undergone extensive testing and integration with the broader Open Source ecosystem at The Apache Software Foundation. With it, the community of developers and users promotes this release series out of beta.


Apache Hadoop is widely deployed at numerous enterprises and institutions worldwide, such as Rackspace, Alibaba, Teradata, Amazon Web Services (AWS), AOL, Tencent, Uber, Apple, Cloudera, Adobe, eBay, Facebook, foursquare, Google, Hortonworks, HP, IBM, Intel, LinkedIn, Microsoft, Netflix, The New York Times, SAP, Tesla Motors, Twitter, and Yahoo.

The ASF is a US 501(c)(3) charitable organization, funded by individual donations and corporate sponsors including Alibaba Cloud Computing, ARM, Bloomberg, Budget Direct, Capital One, Cash Store, Cerner, Cloudera, Comcast, Facebook, Google, Hortonworks, Huawei, IBM, Inspur, iSIGMA, ODPi, LeaseWeb, Microsoft, PhoenixNAP, Pivotal, Private Internet Access, Red Hat, Serenata Flowers, Target, Union Investment, WANdisco, and Yahoo.