Track: General: Ethics
Synthetic datasets have caught the fancy of researchers, statisticians, and analysts. Also called fake or proxy data, synthetic data not only addresses the privacy needs of data subjects but also offers a workaround in unprecedented situations, such as the demand for clinical data during the SARS-CoV-2 pandemic. This talk introduces the concept of synthetic data to an audience that is curious about the hype surrounding it and sees itself using it in the future. Besides building an appreciation of synthetic datasets and their different types, we will also see how the realness of such Frankenstein datasets is gauged. I will also discuss the options available for generating them, and show that you do not need to be a mad scientist to make realistic synthetic datasets.
Recorded at the PyConDE & PyData Berlin 2022 conference, April 11-13, 2022.
https://2022.pycon.de
More details at the conference page: https://2022.pycon.de/program/3VAZ7R
Twitter: https://twitter.com/pydataberlin
Twitter: https://twitter.com/pyconde