Scaling Machine Learning jobs with Kubernetes

Description

Running machine learning jobs at scale places heavy operational demands on infrastructure. As the number of jobs increases, easy-to-use infrastructure becomes a necessity. In this talk we will cover how we use Kubernetes at Textkernel as a job manager to scale our TensorFlow-based jobs. We will also explore other solutions such as distributed TensorFlow and Kubeflow.
