Description
Speaker: Maria Mestre
Track: PyData: Natural Language Processing
Data labelling is often treated as a separate task that takes place before the real "machine learning work" happens, similar to waterfall software engineering practices. However, this is typically the wrong approach, and it can lead to the failure of the whole project. In this talk, we will show how to use weak supervision techniques not only to label large amounts of data significantly faster than with other techniques, but also to protect your ML project from issues in the annotation step that can cause catastrophic errors further downstream.
Recorded at the PyConDE & PyData Berlin 2022 conference, April 11-13, 2022.
https://2022.pycon.de
More details at the conference page: https://2022.pycon.de/program/LAUL7F
Twitter: https://twitter.com/pydataberlin
Twitter: https://twitter.com/pyconde