In machine learning, training large models on massive amounts of data usually improves results. Our customers report, however, that training and deploying such models is either operationally prohibitive or outright impossible for them. Amazon AI Algorithms is designed to solve this problem. It is a collection of distributed streaming ML algorithms that scale to any amount of data. They are fast and efficient because they distribute work across CPU/GPU machines and share a collective distributed state via a highly optimized parameter server. They scale to an unbounded amount of data because they operate in the streaming model: they require only one pass over the data, and their resource consumption never grows as more data arrives. This allows training to be paused, resumed, and snapshotted, and even lets algorithms consume Amazon Kinesis streams directly, providing an “always on” training mechanism. They are production-ready: trained models are automatically containerized and usable in production through Amazon SageMaker hosting. Finally, we provide a convenient SDK that allows scientists to create new algorithms that operate in this model and enjoy all the benefits above.
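To make the streaming model concrete, here is a minimal Python sketch of a single-pass learner with a fixed-size state. It is an illustration only and is not taken from Amazon AI Algorithms or its SDK: the class `StreamingMoments` and the function `train` are hypothetical names, and Welford's running mean and variance stands in for a real learning algorithm. It shows the three properties described above: one pass with constant memory, snapshot and resume, and merging of partial states computed by separate workers.

```python
import json


class StreamingMoments:
    """Hypothetical example: running count, mean, and variance in O(1) memory."""

    def __init__(self, count=0, mean=0.0, m2=0.0):
        self.count = count
        self.mean = mean
        self.m2 = m2  # sum of squared deviations from the current mean

    def update(self, x):
        # Welford's update: each record is seen once and then discarded.
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (x - self.mean)

    def merge(self, other):
        # Combine two partial states, e.g. from two workers on disjoint shards.
        total = self.count + other.count
        if total == 0:
            return StreamingMoments()
        delta = other.mean - self.mean
        mean = self.mean + delta * other.count / total
        m2 = self.m2 + other.m2 + delta * delta * self.count * other.count / total
        return StreamingMoments(total, mean, m2)

    @property
    def variance(self):
        # Population variance of everything seen so far.
        return self.m2 / self.count if self.count else 0.0

    def snapshot(self):
        # The whole training state is three numbers, so a snapshot is tiny.
        return json.dumps({"count": self.count, "mean": self.mean, "m2": self.m2})

    @classmethod
    def restore(cls, blob):
        return cls(**json.loads(blob))


def train(stream, state=None):
    """Consume an iterable of numbers once, optionally resuming from a state."""
    if state is None:
        state = StreamingMoments()
    for x in stream:
        state.update(x)
    return state
```

For example, calling `train` on the first half of a data set, saving `snapshot()`, and later calling `train` on the second half with the result of `StreamingMoments.restore(...)` yields the same mean and variance as a single uninterrupted pass. Because the state never grows, the same loop could in principle run over an endless stream.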
This talk will discuss our design choices and some of the inner workings of the system. It will also describe the distributed streaming model and its numerous benefits to machine learning practitioners. We will show how to invoke large-scale learning from Amazon SageMaker or Amazon EMR and how to host the trained model. Time permitting, we will show how to develop a new algorithm using the SDK.
Publisher: Amazon Web Services