Description
Pandas is a fast and expressive library for data analysis that doesn’t naturally scale to more data than can fit in memory. PySpark is the Python API for Apache Spark that is designed to scale to huge amounts of data but lacks the natural expressiveness of Pandas. We will introduce Sparkling Pandas, a new library that brings together the best features of Pandas and PySpark; Expressiveness, speed, and scalability.