This post updates the previous post with data from more recently labeled news event excerpts. This visualization also includes only excerpts of five sentences or fewer, to keep the document sizes consistent.

The news events included in this visualization are listed below. Each event can also be analyzed individually by clicking its link.






San Bernardino

Isla Vista

Text Visualizations Explanation

This post displays visualizations of the text based on the labelled categories. Each category is shown as a circle on the distance plot, and the chart on the right shows the word distribution associated with a category. When lambda = 1, the chart lists the most common words in the category; when lambda = 0, it lists the words most specific to the category, as computed by the relevance metric.
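To make the effect of lambda concrete, here is a minimal sketch. It assumes the relevance metric is the one commonly used in LDAvis-style plots, relevance = lambda * log p(w|t) + (1 - lambda) * log(p(w|t) / p(w)); the post does not state the formula, so treat that as an assumption. The words and probabilities are hypothetical toy values, not taken from the data set.

```python
import math

def relevance(p_word_given_topic, p_word, lam):
    """Assumed relevance formula: blend of log-probability and log-lift."""
    lift = p_word_given_topic / p_word
    return lam * math.log(p_word_given_topic) + (1 - lam) * math.log(lift)

# Hypothetical toy values: word -> (p(word | topic), p(word) in all documents)
toy = {
    "said": (0.050, 0.048),          # frequent in every category
    "gunman": (0.020, 0.004),        # concentrated in this category
    "arraignment": (0.002, 0.0002),  # rare overall, but highly specific
}

def rank(words, lam):
    """Return the words sorted from most to least relevant for a given lambda."""
    return sorted(words, key=lambda w: relevance(*words[w], lam), reverse=True)

by_frequency = rank(toy, 1.0)    # lambda = 1: ordered by p(word | topic)
by_specificity = rank(toy, 0.0)  # lambda = 0: ordered by lift
print(by_frequency)
print(by_specificity)
```

Under this assumption, lambda = 1 puts the generic word first and lambda = 0 puts the rare but category-specific word first, which matches the behavior described above.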

The categories appear on the plot as numbers. The corresponding label titles and word counts are:

	Topic 1: POLICY, number of words: 178427
	Topic 2: EVENT, number of words: 185512
	Topic 3: VICTIMS, number of words: 132444
	Topic 4: ACCOUNT, number of words: 139529
	Topic 5: MOURNING, number of words: 69965
	Topic 6: SAFETY, number of words: 71556
	Topic 7: GRIEF, number of words: 66832
	Topic 8: PERPETRATOR, number of words: 68053
	Topic 9: INVESTIGATION, number of words: 43818
	Topic 10: SOCIALSUPPORT, number of words: 43542
	Topic 11: TRAUMA, number of words: 53066
	Topic 12: RESOURCES, number of words: 37341
	Topic 13: PHOTO, number of words: 31942
	Topic 14: RACECULTURE, number of words: 27025
	Topic 15: LEGAL, number of words: 24346
	Topic 16: MEDIA, number of words: 22273
	Topic 17: THREAT, number of words: 16626
	Topic 18: JOURNEY, number of words: 24483
	Topic 19: MISCELLANEOUS, number of words: 12121
	Topic 20: HERO, number of words: 5923

The size of each circle corresponds to the size of its category. Hovering over a word in the chart on the right resizes the circles in proportion to the count of that word in each category. Clicking on a topic displays that topic's word distribution, and clicking on an empty part of the distance plot shows the overall word distribution of all the documents.
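As a rough illustration of how circle sizes relate to category sizes, the sketch below turns the word counts listed above into shares of the total. It assumes that "size" means circle area scaled to each category's share of all words, so the radius grows with the square root of the share; the post does not specify the exact scaling, and the names used here are illustrative only.

```python
import math

# Word counts per category, copied from the list above.
word_counts = {
    "POLICY": 178427, "EVENT": 185512, "VICTIMS": 132444, "ACCOUNT": 139529,
    "MOURNING": 69965, "SAFETY": 71556, "GRIEF": 66832, "PERPETRATOR": 68053,
    "INVESTIGATION": 43818, "SOCIALSUPPORT": 43542, "TRAUMA": 53066,
    "RESOURCES": 37341, "PHOTO": 31942, "RACECULTURE": 27025, "LEGAL": 24346,
    "MEDIA": 22273, "THREAT": 16626, "JOURNEY": 24483, "MISCELLANEOUS": 12121,
    "HERO": 5923,
}

total = sum(word_counts.values())

# Share of all words that falls in each category.
shares = {name: count / total for name, count in word_counts.items()}

# Assumption: circle area is proportional to share, so radius ~ sqrt(share).
relative_radius = {name: math.sqrt(share) for name, share in shares.items()}

largest = max(shares, key=shares.get)
smallest = min(shares, key=shares.get)
print(f"largest: {largest} ({shares[largest]:.1%})")
print(f"smallest: {smallest} ({shares[smallest]:.1%})")
```

The same calculation applies when hovering over a word: replace the category word counts with the count of that word in each category.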