Contribute Media
A thank you to everyone who makes this possible: Read More

A new approach to DataFrames and pipelines

Description

Modern Data Science: A new approach to DataFrames and pipelines Maarten Breddels, Jovan Veljanoski Audience level:

We show how to deal with massive datasets using small resources using the Python Vaex DataFrame library. Using computational graphs, efficient algorithms and storage (Apache Arrow / hdf5) Vaex can easily handle up to a billion rows, even on your laptop. As a bonus, Vaex can automatically generate a Machine Learning pipeline using the graph structure build-up internally in the DataFrame.

Details

Improve this page