Description
The statistician John Tukey -- who designed the box plot and coined the term "bit" -- wrote: "An approximate answer to the right problem is worth a good deal more than an exact answer to an approximate problem". Python has become one of the major languages for statistical data analysis, not least because of the expressiveness of the language itself and the availability of tools like Jupyter Notebooks, which enable iterative reasoning about a problem and its solutions.
This talks takes one step beyond an introduction to statistics with Python and aims to familiarize the audience with two concepts: a class of problems (so-called inverse problems), and a powerful statistical tool (the random walk, or more formally Markov-Chain Monte Carlo (MCMC) sampling with the Metropolis algorithm).
In inverse problems, model parameters are estimated from observational data. Both model and data are expected to be affected by error. The objective is not only to find parameters that best describe the observations, but also to figure out how good, or how possibly bad, a solution might be. Inverse problems are extremely common in many fields and crop up each time we attempt to reconstruct a reality from sensor, radar, scattering or imaging data.
The Metropololis-Hastings algorithm offers a solution via random sampling of a Bayesian posterior distribution. Even though listed as one of the 20th century's top 10 algorithms by the journal Computing in Science & Engineering, the Metropolis algorithm is easy to understand and implement, and a fun and instructive way to explore even complicated multi-variate probability distributions.