
High Performance Data Processing in Python



NumPy and Numba are popular Python libraries for processing large quantities of data. When running complex transformations on large datasets, many developers fall into common pitfalls that cripple the performance of these libraries. This talk explains how NumPy and Numba work under the hood and how they use vectorisation to process large amounts of data extremely quickly. We use these tools to reduce the processing time of a dataset from 3 years to 12 hours, even when the code runs on a single MacBook Pro.
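
To make the idea of vectorisation concrete, here is a minimal sketch. It is not code from the talk: the function names, the arithmetic, and the sample data are all hypothetical. It contrasts a per-element Python loop, one common pitfall of the kind described above, with the equivalent whole-array NumPy expression. Both give the same result, but the second hands the loop to NumPy's compiled routines and is usually much faster on large arrays.

```python
import numpy as np


def scale_and_shift_loop(values, scale, shift):
    """Hypothetical slow version: visits every element from Python."""
    out = np.empty(len(values), dtype=np.float64)
    for i in range(len(values)):
        # Each iteration pays Python interpreter overhead.
        out[i] = values[i] * scale + shift
    return out


def scale_and_shift_vectorised(values, scale, shift):
    """Hypothetical fast version: one expression over the whole array."""
    # NumPy runs the multiply and the add in compiled code.
    return values * scale + shift


# Illustrative data only; real workloads would be far larger.
data = np.arange(100_000, dtype=np.float64)

loop_result = scale_and_shift_loop(data, 2.0, 1.0)
vector_result = scale_and_shift_vectorised(data, 2.0, 1.0)

# Both approaches compute the same values.
assert np.allclose(loop_result, vector_result)
```

Timing the two functions, for example with the standard-library `timeit` module, is a simple way to see the gap on your own machine. The size of the speed-up depends on the hardware and the array size.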

