Anomaly Detection

Description

a.k.a. How to Convince Your Boss That You Need Machine Learning

This presentation comes from a recurring question I get from my students: "OK, all this machine learning stuff is great but how do you use it in real life?"

The students ask this because they come from what some of us would call old-fashioned programming jobs, where you explicitly program every step of what a computer does with the data. Sounds crazy, right? But that is still the reality in many companies. There are several reasons for dealing with data that way: legacy applications, a belief that hand-coded rules get 100% accuracy (yeah, right), or simply an attitude of "if it ain't broke, don't fix it" - despite the fact that it is broken.

I point such students towards Anomaly Detection (AD). Why? Because AD is an easy technique to plug into a non-ML system. Every company has some form of data aggregation - and data scrutiny - system, be it for webserver logs, forex trades, or hotel booking data. Dirty data exists in that scrutiny system, or is fed into it, and there are hand-coded rules to prevent this bad data from being processed.

That is the place where ML techniques, notably AD, should be used instead of hand-coded rules.
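To make the contrast concrete, here is a minimal sketch of swapping a hard-coded cutoff for a threshold learned from the data itself. It uses the modified z-score (median and median absolute deviation) with the conventional 0.6745 scaling and 3.5 cutoff; this is one simple AD technique chosen for illustration, not necessarily the one covered in the talk. The `mad_outliers` function and the `bookings` numbers are hypothetical.

```python
from statistics import median

def mad_outliers(values, threshold=3.5):
    """Return indices of values whose modified z-score exceeds `threshold`.

    The cutoff adapts to the data's own median and spread, instead of
    relying on a hand-picked constant such as `if value > 500`.
    """
    med = median(values)
    # Median absolute deviation: a spread estimate that outliers barely affect.
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return []  # no spread at all, so nothing can be ranked as unusual
    return [i for i, v in enumerate(values)
            if 0.6745 * abs(v - med) / mad > threshold]

# Hypothetical daily hotel booking counts with one corrupted record.
bookings = [102, 98, 110, 95, 105, 99, 101, 940, 97, 103]
print(mad_outliers(bookings))  # prints [7]
```

The median and MAD are used rather than the mean and standard deviation because a single extreme value would otherwise inflate the very statistics used to detect it.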

AD is quite easy to explain without going into mathematics, which is a good thing if you need to convince your boss. AD can be used as a substitute for hand-coded rules, as a way of tuning (hyper-)parameters for rules, or even alongside such rules. Moreover, most AD techniques can be used on both static datasets and running time series.
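For the running time series case, a rough sketch might keep a sliding window of recent values and flag any new point that sits too far from the window's mean. The `RollingDetector` class, its window size, the warm-up length of 5, and the threshold of 3 standard deviations are all illustrative assumptions, not something prescribed by the talk.

```python
from collections import deque
from statistics import mean, stdev

class RollingDetector:
    """Flag points that deviate strongly from a sliding window of recent values."""

    def __init__(self, window=20, threshold=3.0, warmup=5):
        self.history = deque(maxlen=window)  # oldest values drop off automatically
        self.threshold = threshold
        self.warmup = warmup

    def is_anomaly(self, x):
        flagged = False
        if len(self.history) >= self.warmup:
            mu = mean(self.history)
            sigma = stdev(self.history)
            flagged = sigma > 0 and abs(x - mu) / sigma > self.threshold
        if not flagged:
            # Only normal-looking points update the baseline.
            self.history.append(x)
        return flagged

# Hypothetical stream of response times with a single spike.
stream = [10, 11, 9, 10, 12, 10, 11, 50, 10, 9]
detector = RollingDetector()
flags = [i for i, x in enumerate(stream) if detector.is_anomaly(x)]
print(flags)  # prints [7]
```

Such a detector can also run alongside existing hand-coded rules: the rules keep rejecting known-bad records, while the detector surfaces the cases nobody thought to write a rule for.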

We will discuss a couple of examples of AD use. AD may appear to be a rather specific field to a typical programmer, but that's far from true. AD is just a clever (read: slightly different) way of using well-known ML techniques.
