Description
Data preprocessing and feature exploration are crucial steps in a modeling workflow. In this tutorial, I will demonstrate how to use Python libraries such as scikit-learn, statsmodels, and matplotlib to carry out these pre-modeling steps. Topics covered include missing values, variable types, outlier detection, multicollinearity, interaction terms, and visualizing variable distributions. Finally, I will show how applying these techniques affects model performance. Interactive Jupyter notebooks will be provided.