
Writing highly scalable and provenanceable data pipelines

Description

In this talk we will explore how to launch and maintain highly scalable data pipelines on Kubernetes. We will walk through setting up a Pachyderm cluster and deploying Python-based data processing workloads. This setup lets teams develop and maintain robust data pipelines while benefiting from autoscaling clusters and rapid code iteration.
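
To make the setup concrete, below is a minimal sketch of the kind of Python transform a Pachyderm pipeline runs. Pachyderm mounts each input repo under /pfs/<repo> inside the worker container and versions whatever the code writes to /pfs/out; the repo name raw_data and the word-count logic are illustrative assumptions, not the talk's actual workload.

```python
# Sketch of a Python transform for a Pachyderm pipeline.
# Assumes an input repo named "raw_data" (hypothetical); Pachyderm mounts it
# at /pfs/raw_data and collects anything written to /pfs/out as the
# pipeline's versioned, provenance-tracked output.
import os
from collections import Counter

INPUT_DIR = "/pfs/raw_data"   # hypothetical input repo name
OUTPUT_DIR = "/pfs/out"       # standard Pachyderm output location

def process_file(path: str) -> Counter:
    """Count word frequencies in a single input file."""
    counts = Counter()
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            counts.update(line.split())
    return counts

def main() -> None:
    # Each datum the pipeline receives appears as files under INPUT_DIR;
    # one output file is written per input file.
    for name in os.listdir(INPUT_DIR):
        in_path = os.path.join(INPUT_DIR, name)
        if not os.path.isfile(in_path):
            continue
        counts = process_file(in_path)
        out_path = os.path.join(OUTPUT_DIR, f"{name}.counts.txt")
        with open(out_path, "w", encoding="utf-8") as out:
            for word, count in counts.most_common():
                out.write(f"{word}\t{count}\n")

if __name__ == "__main__":
    main()
```

Because Pachyderm tracks which input commits produced each output commit, a script like this gains data provenance for free, and the Kubernetes-backed workers can scale out as the input repo grows.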

