Newsroom Textual Analysis and Visualization Tools Built With R Shiny

Aleszu Bajak, John Wihbey, Gibby Free, Paxtyn Merten

While word cloud visualizations and similar types of simple tools are widely available on the web, the more sophisticated textual analysis software and code unfortunately remain the domain of experts and users of languages like R and Python. To address this, we created two prototypes that would allow newsrooms to harness the power of R’s textual and sentiment analysis packages in a simple drag-and-drop format through public-facing R Shiny apps. These two apps show how powerful data wrangling, analysis and visualization functions may be increasingly democratized for time-pressed journalists. We demonstrate use cases through: 1) exploratory analysis of public speeches; 2) exploration of sentiment analysis around large bodies of political advertising, such as the new archive on the Facebook platform; and 3) comparative analysis of one outlet’s coverage on a specific topic as compared to that of the competition.

R is an open-source statistical software environment with a robust community of developers who collaborate and share knowledge, tools and code [1]. Newsrooms are increasingly adopting R. R has statistical and textual analysis packages that allow for rich and generative forms of interpretation. In addition, R Shiny, which deploys code and datasets to a server, is easily customizable and user-friendly: a dataset of text and numbers can be dropped in and a set of questions asked of it, guided by a pre-coded workflow.

These apps allow reporters to drop in spreadsheets or text documents of interest and ask a series of cross-cutting questions. In this project, we have customized R’s “tidytext” [2] and “plotly” [3] packages and UVM’s “labMTsimple” [4] sentiment dictionary for textual and sentiment analysis, and added several intervention points where users can subset the data, ask it guided questions and produce exploratory data visualizations.
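As a rough illustration of dictionary-based sentiment scoring of the kind described above, the following base-R sketch tokenizes a text and averages the scores of the words it finds in a lexicon. It is not the apps' code: `toy_lexicon` and its scores are made-up placeholders standing in for the labMT word list, and `tokenize_words()` and `score_sentiment()` are hypothetical helper names.

```r
# Toy lexicon: words and scores are illustrative placeholders, NOT labMT values.
toy_lexicon <- data.frame(
  word  = c("great", "win", "happy", "bad", "disaster"),
  score = c(8, 7, 8, 2, 1),
  stringsAsFactors = FALSE
)

# Split text into lowercase word tokens (a simple base-R stand-in for a
# tokenizer such as tidytext's unnest_tokens()).
tokenize_words <- function(text) {
  tokens <- unlist(strsplit(tolower(text), "[^a-z']+"))
  tokens[nzchar(tokens)]
}

# Hypothetical helper: mean lexicon score over the tokens present in the lexicon.
# Returns NA when no token matches.
score_sentiment <- function(text, lexicon) {
  tokens  <- tokenize_words(text)
  matched <- lexicon$score[match(tokens, lexicon$word)]
  matched <- matched[!is.na(matched)]
  if (length(matched) == 0) return(NA_real_)
  mean(matched)
}

speech <- "We will win. It will be great, not a disaster."
sentiment <- score_sentiment(speech, toy_lexicon)
print(sentiment)
```

Averaging only the matched words means the score reflects the lexicon's coverage of the text. Note that "not a disaster" still counts "disaster" at face value, a known limitation of simple dictionary methods.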

We consider the possibilities enabled by our applications by looking at two use cases. The first, with a prototype Shiny app, involves looking at a corpus of texts or headlines, such as the public speeches of a politician or coverage of a news topic, to understand rhetorical patterns, popular formulations and sentiment (see Figure 1). Techniques such as descriptive statistics, sentiment analysis and n-gram analysis are employed. The second, also with a prototype Shiny app, involves discerning patterns in political advertisements on Facebook, allowing for unique insights into the strategies employed in political messaging (see Figure 2). Techniques such as textual and sentiment analysis are employed.
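To make the n-gram analysis mentioned above concrete, here is a minimal base-R sketch that counts bigrams across a handful of headlines. The headlines are invented for illustration, and `tokenize_words()` and `count_ngrams()` are hypothetical helpers rather than functions from the apps or from tidytext.

```r
# Split text into lowercase word tokens.
tokenize_words <- function(text) {
  tokens <- unlist(strsplit(tolower(text), "[^a-z']+"))
  tokens[nzchar(tokens)]
}

# Hypothetical helper: count n-grams within each document separately,
# so that no n-gram spans two headlines.
count_ngrams <- function(docs, n = 2) {
  grams <- unlist(lapply(docs, function(doc) {
    tokens <- tokenize_words(doc)
    if (length(tokens) < n) return(character(0))
    starts <- seq_len(length(tokens) - n + 1)
    vapply(starts,
           function(i) paste(tokens[i:(i + n - 1)], collapse = " "),
           character(1))
  }))
  # Most frequent n-grams first.
  sort(table(grams), decreasing = TRUE)
}

# Invented example headlines.
headlines <- c(
  "Senate votes on health care bill",
  "Health care bill stalls in Senate",
  "Governor signs health care order"
)
bigrams <- count_ngrams(headlines, n = 2)
print(head(bigrams, 3))
```

Setting `n = 1` gives plain word frequencies, the simplest of the descriptive statistics, while `n = 3` surfaces longer recurring formulations.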

These prototypes point to the possibility of a broader ecosystem of similar deadline-friendly apps for newsrooms that could provide them with greater analytical power and higher-level insights.

This paper was presented at the 2019 Computation + Journalism conference at the University of Miami.

References

[1] “The R Project for Statistical Computing.” Retrieved from https://cran.r-project.org/

[2] Robinson, D. and Silge, J. 2018. “tidytext: Text Mining using 'dplyr', 'ggplot2', and Other Tidy Tools.” Retrieved from https://cran.r-project.org/web/packages/tidytext/index.html

[3] rOpenSci. 2018. “Plotly: an interactive graphing library for R.” Retrieved from https://github.com/ropensci/plotly

[4] Reagan, A. 2018. “labMTsimple Documentation.” Retrieved from https://media.readthedocs.org/pdf/labmt-simple/latest/labmt-simple.pdf

Demo the Shiny apps

Inspect TXT files

Click here for the TXT Shiny app.

Inspect CSV files

Click here for the CSV Shiny app.

Inspect Facebook political ads

Click here for the Facebook ads Shiny app.

Demo datasets for these Shiny apps

Google Sheet of 20,000+ r/politics Reddit posts from 7/21/17 to 10/12/18.

CSV of Trump tweets through Nov. 2016.

CSV of women running in House and Senate races 2018.

CSV of 1 year of Washington Post Politics headlines.

CSV of 1 year of New York Times Politics headlines.

CSV of three months of Heidi Heitkamp articles.

CSV of six months of Lisa Murkowski articles.

TXT of Trump campaign speeches.

TXT of 2018 Obama rally in Illinois.

TXT of Trump State of the Union speech.

ProPublica's database of Facebook political ads. The full dataset is available here.

Bonus apps

Storybench mapmaker

Click here for the mapmaker Shiny app.
