Description
This talk demonstrates how to scale a Python-based machine learning workflow to larger models and larger datasets. The talk will introduce a common workflow using NumPy, pandas, and scikit-learn, and discuss some challenges with scaling that workflow out to larger datasets. We'll then see how dask and dask-ml work with and extend these libraries to enable large-scale parallel and distributed machine learning.

Presenter(s):
Speaker: Tom Augspurger, Anaconda, Inc.
Speaker: Olivier Grisel, INRIA