Description
Scattertext is a Python package that lets you compare how words are used in two types of documents, producing interactive, JavaScript-based visualizations that can easily be embedded in Jupyter Notebooks. Using spaCy and Empath, Scattertext can also show how emotional states and words relating to a particular topic differ between the two.
Abstract
The notebooks and presentation for this talk are available at https://github.com/JasonKessler/Scattertext-PyData.
Motivation and introduction
- What's the matter with word clouds?
- How to read a plot made by Scattertext
How to make your own plots
- Preparing a Pandas data frame with your data set
- Plotting with Scattertext, and fine tuning plots for interpretability and speed
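The two steps above can be sketched roughly as follows. This is a minimal illustration and not the talk's own notebook code: it assumes the `scattertext` package is installed and uses the 2012 political convention sample corpus that ships with it. The helper `missing_columns`, the function `write_convention_plot`, and the output file name are hypothetical names chosen for this example. The whitespace tokenizer and the `minimum_term_frequency` value are one plausible way to trade detail for speed and readability, not a recommendation from the talk.

```python
def missing_columns(columns, category_col="party", text_col="text"):
    """Return the required column names that are absent from `columns`."""
    return [name for name in (category_col, text_col) if name not in columns]


def write_convention_plot(out_path="Convention-Visualization.html"):
    """Build a Scattertext plot from the library's bundled sample corpus."""
    # Imported here so the helper above can be used without the plotting stack.
    import scattertext as st

    # One row per document: a category label ('party') and raw text ('text').
    df = st.SampleCorpora.ConventionData2012.get_data()
    absent = missing_columns(df.columns)
    if absent:
        raise ValueError(f"data frame is missing columns: {absent}")

    # A whitespace tokenizer is quick; a spaCy pipeline can be passed instead.
    corpus = st.CorpusFromPandas(
        df,
        category_col="party",
        text_col="text",
        nlp=st.whitespace_nlp_with_sentences,
    ).build()

    # Raising minimum_term_frequency plots fewer, more frequent terms.
    html = st.produce_scattertext_explorer(
        corpus,
        category="democrat",
        category_name="Democratic",
        not_category_name="Republican",
        minimum_term_frequency=5,
        width_in_pixels=1000,
        metadata=df["speaker"],
    )
    with open(out_path, "w", encoding="utf-8") as handle:
        handle.write(html)
    return out_path
```

Calling `write_convention_plot()` should write a standalone HTML file that can be opened in a browser. For your own data, the same calls would take a data frame with one row per document, a category column, and a text column.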
Scattertext and the Python NLP ecosystem
- Visualizing emotions using Empath.
- Using word vectors from spaCy and elsewhere to see how topic-specific language differs.
- Visualizing topic models from scikit-learn.
Links
- Source code for the package is hosted on GitHub at https://github.com/JasonKessler/scattertext
- For more information, please see the paper, which will appear as a 2017 ACL demo, at https://arxiv.org/abs/1703.00565