AWS And Facebook Launch PyTorch Tools
Written by Alex Denham   
Friday, 05 June 2020

Two new tools have been released for PyTorch, the open-source library for deep learning. Both are collaborations between Amazon Web Services (AWS) and Facebook. TorchServe is a PyTorch model serving library, while the TorchElastic Controller for Kubernetes adds Kubernetes support to TorchElastic, a library for fault-tolerant and elastic training in PyTorch.

PyTorch is an optimized tensor library for deep learning using GPUs and CPUs. It aims to offer a replacement for NumPy that makes use of the power of GPUs, while also serving as a deep learning research platform that offers maximum flexibility and speed.


TorchServe aims to provide a clean, well-supported, industrial-grade path to deploying PyTorch models for inference at scale without having to write custom code. It provides a low-latency prediction API and embeds default handlers for the most common applications, such as object detection and text classification. It also includes multi-model serving, model versioning for A/B testing, monitoring metrics, and RESTful endpoints for application integration.
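As a rough illustration of what the RESTful integration can look like from the client side, the sketch below builds a prediction request using only the Python standard library. It assumes a TorchServe instance listening on its default inference address (`http://localhost:8080`) and a registered model called `my_classifier`; the model name, the payload, and the helper function are hypothetical and chosen only for the example.

```python
import urllib.request


def build_prediction_request(model_name, payload, host="http://localhost:8080"):
    """Build (but do not send) a POST request for a TorchServe prediction endpoint.

    Assumption: the server exposes predictions at /predictions/<model_name>
    on the default inference port 8080.
    """
    url = f"{host}/predictions/{model_name}"
    return urllib.request.Request(url, data=payload, method="POST")


# "my_classifier" and the payload bytes are placeholders for illustration.
req = build_prediction_request("my_classifier", b"example input bytes")
print(req.get_method(), req.full_url)

# With a server actually running, the request could be sent like this:
# with urllib.request.urlopen(req) as response:
#     print(response.read())
```

Building the request separately from sending it keeps the example runnable without a live server; in a real application the payload would typically be an image or a piece of text that the model's handler knows how to decode.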

The Kubernetes Controller with TorchElastic integration gives PyTorch developers a way to train machine learning models on a cluster of compute nodes that can change dynamically without disrupting the model training process. If a node goes down, TorchElastic can pause node-level training and resume once the node is healthy again. By using the Kubernetes controller with TorchElastic, distributed training jobs can be run on clusters whose nodes get replaced, either because of hardware issues or node reclamation. This means developers can create training systems that work on large distributed Kubernetes clusters that include cheaper spot instances. The price of such instances can vary significantly depending on how many unused EC2 instances are available, and they are liable to interruption, which would cause problems with traditional machine learning training frameworks.
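The general idea of surviving an interrupted node can be sketched with a toy checkpoint-and-resume loop. This is not TorchElastic's API and involves neither PyTorch nor Kubernetes: it is a minimal standard-library simulation, and every name in it (`train`, `fail_at_epoch`, the JSON checkpoint file) is hypothetical.

```python
import json
import os
import tempfile


def train(total_epochs, checkpoint_path, fail_at_epoch=None):
    """Run a toy training loop, resuming from checkpoint_path if it exists."""
    state = {"epoch": 0, "loss": 1.0}
    if os.path.exists(checkpoint_path):
        # A previous run was interrupted: pick up where it left off.
        with open(checkpoint_path) as f:
            state = json.load(f)

    while state["epoch"] < total_epochs:
        if fail_at_epoch is not None and state["epoch"] == fail_at_epoch:
            raise RuntimeError("simulated node interruption")
        state["loss"] *= 0.5  # stand-in for one epoch of real training
        state["epoch"] += 1
        # Persist progress after every completed epoch.
        with open(checkpoint_path, "w") as f:
            json.dump(state, f)
    return state


with tempfile.TemporaryDirectory() as tmp:
    ckpt = os.path.join(tmp, "checkpoint.json")
    try:
        # First attempt: the "node" disappears before epoch 3 can run.
        train(total_epochs=5, checkpoint_path=ckpt, fail_at_epoch=3)
    except RuntimeError:
        pass
    # Second attempt: a replacement worker resumes from the saved state.
    final_state = train(total_epochs=5, checkpoint_path=ckpt)

print(final_state)
```

The second call does not repeat the three epochs that had already completed, which is the property that makes interruptible instances practical for long training jobs. A real elastic setup would also have to handle changes in the number of workers, which this sketch leaves out.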


More Information


TorchElastic Controller For Kubernetes
