Summary
Statistical analysis comes in two main flavors: frequentist and Bayesian. The subtle differences between the two can lead to widely divergent approaches to common data analysis tasks. After a brief discussion of the philosophical distinctions between the two views, I'll use well-known Python libraries to demonstrate how this philosophy affects practical approaches to several common analysis tasks.
Description
In scientific data mining and machine learning, a fundamental division is that between the frequentist and Bayesian approaches to statistics. Often the fodder for impassioned debate among statisticians and other practitioners, the subtle philosophical differences between the two camps can lead to surprisingly different practical approaches to the analysis of scientific data.
In this talk I will delve into both the philosophical and practical aspects of Bayesian and frequentist approaches, drawing from a series of posts from my blog.
I'll start by addressing the philosophical differences between frequentism and Bayesianism, which boil down to different definitions of probability. I'll then move briefly into the mathematical details behind the two approaches, at a level that will be informative to a general scientific audience. Finally, I'll show examples of the two approaches applied to increasingly complicated problems using standard Python packages, namely NumPy, SciPy, Matplotlib, and emcee.
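To give a flavor of the kind of comparison involved, here is a minimal sketch of a very simple case: estimating a constant value from repeated noisy measurements. It is an illustration only, not necessarily an example from the talk itself. The data, the variable names, the known Gaussian errors, and the flat prior are all assumptions, and the Bayesian posterior is evaluated on a plain NumPy grid rather than sampled with emcee to keep the sketch short.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical data: N noisy measurements of a single constant value.
mu_true = 1000.0                        # assumed true value
N = 50
sigma = np.sqrt(mu_true) * np.ones(N)   # assumed known Gaussian errors
x = rng.normal(mu_true, sigma)

# Frequentist approach: for Gaussian errors, the maximum-likelihood
# estimate is the inverse-variance weighted mean of the measurements.
w = 1.0 / sigma**2
mu_ml = np.sum(w * x) / np.sum(w)
sigma_ml = np.sum(w) ** -0.5

# Bayesian approach: with a flat prior on mu (an assumption), the
# posterior is proportional to the likelihood. Evaluate it on a grid.
mu_grid = np.linspace(mu_ml - 10 * sigma_ml, mu_ml + 10 * sigma_ml, 4001)
dmu = mu_grid[1] - mu_grid[0]
resid = (x[:, None] - mu_grid[None, :]) / sigma[:, None]
log_post = -0.5 * np.sum(resid**2, axis=0)
post = np.exp(log_post - log_post.max())  # subtract max for stability
post /= post.sum() * dmu                  # normalize to unit area

# Posterior mean and standard deviation from the gridded posterior.
mu_bayes = np.sum(mu_grid * post) * dmu
sigma_bayes = np.sqrt(np.sum((mu_grid - mu_bayes) ** 2 * post) * dmu)

print(f"frequentist: {mu_ml:.2f} +/- {sigma_ml:.2f}")
print(f"Bayesian:    {mu_bayes:.2f} +/- {sigma_bayes:.2f}")
```

In a case this simple the two approaches give essentially the same numbers, even though the interpretation of the result differs. The practical differences tend to show up in more complicated problems.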
With this combination of philosophy and practical examples, the audience should walk away with a much better understanding of the differences between frequentist and Bayesian approaches to statistical analysis, and especially how the philosophy of each approach affects the practical aspects of computation in data-intensive scientific research.