Reproducible Machine Learning

YouTube

Description

Reproducibility is a cornerstone of the scientific method. In production machine learning especially, it is crucial to ensure that a hidden source of randomness is not the real reason for an apparent improvement in model performance. In my talk I will elaborate on the importance of reproducibility and show how we build reproducible machine learning pipelines at Netguru.

Although reproducibility seems to be a must-have in machine learning papers, it is still not standard practice.
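As a minimal sketch of the "hidden source of randomness" problem, the toy example below compares runs with the same seed and with different seeds. The function `train_and_score`, the base accuracy of 0.80, and the noise range are all hypothetical and only stand in for a real training run; this is not the pipeline presented in the talk.

```python
import random


def train_and_score(seed):
    """Toy stand-in for a training run: returns a noisy 'accuracy'."""
    rng = random.Random(seed)  # local generator, so no hidden global state
    base_accuracy = 0.80       # hypothetical true quality of the model
    noise = rng.uniform(-0.02, 0.02)  # stands in for init/shuffle randomness
    return base_accuracy + noise


# Same seed: the result is identical, so any change must come from code or data.
run_a = train_and_score(seed=42)
run_b = train_and_score(seed=42)

# Different seeds: the spread shows how much "improvement" may be just noise.
scores = [train_and_score(seed=s) for s in range(10)]
spread = max(scores) - min(scores)

print(run_a == run_b)
print(f"seed-to-seed spread: {spread:.4f}")
```

If a claimed improvement is smaller than the seed-to-seed spread, it cannot be distinguished from noise without fixing the seed or averaging over several seeds.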

Outline of talk:

Definitions:
- reproducibility
- replicability
- generalisability
Motivation for achieving reproducibility
Full reproducibility == Continuous Delivery for ML
Changes in the ML development process
- code
- data
- models
How do we manage change in the ML development process?
Data versioning
- Quilt Data
Experiments management
- MLflow / Polyaxon
Summary
