Description
Ever wonder whether 10,000 hours is really the baseline? Did a particular set of events make the difference in someone getting to the Olympics? WorldRowing has captured data on races and athletes for decades, but little has been done to analyze that data across countries, athletes, and races to understand possible outcomes or to identify where investment might have the most impact on high-performance rowing. Data for each race and each athlete is stored on the WorldRowing website, with varying amounts of personal and race information available for athletes spanning several decades. This talk investigates the athlete race data from WorldRowing.com and demonstrates an end-to-end walkthrough of a data analysis problem using Python.
Introduction/Problem Statement
- World Rowing Athlete Database
- Description of the data
Data Analysis
- Scraping the data
- What to get, how to get it
- Data Munging and Wrangling
- Analysis
- Does the data show correlations?
Takeaways
- How to identify a problem for data analysis
- Dealing with inconsistent data
- Drawing conclusions from the dataset