DatasetOverview

Mark A. Greenwood edited this page Nov 26, 2025 · 3 revisions

Dataset Overview

This tab provides an overview of the entire dataset (unless filters have been selected), and includes both abusive and non-abusive messages. Its purpose is to give an overall feel for the online traffic relative to the person or people being monitored (targets).

Statistical Overview

The first panel provides some general statistics about the dataset, both as a whole and broken down by platform.

Statistical overview of the full dataset

Timeline

Next, a timeline of messages is displayed, colour coded by platform.

Full timeline of the whole dataset

Click and drag with the mouse to select a date range to view in more detail. Note that this selection does not affect the rest of the dashboard unless the “Update to Selected dates” button is clicked.

Section of timeline for the whole dataset

Clicking on one of the platform names in the legend will deselect it from the visualisation. This does not affect the rest of the dashboard. To filter the dashboard by a platform (i.e. to show only posts from that platform), use the Filter option at the top of the dashboard.

Full timeline of the whole dataset

Hashtags

This panel shows the most frequent hashtags that appear in the dataset.

Hashtag cloud for the whole dataset

Hovering over a hashtag will display the number of occurrences in the data, while clicking on it will add it to the main filter. “Update” must be clicked before the filter is applied to the data.

Topics

The topic panel shows a bar chart of the most frequent topics found in the data, as assigned by an automated topic classifier with pre-set topics. More information about the classifier (originally designed for news framing) can be found here. Hovering over any bar in the chart displays the frequency for that topic.

Topics seen across the whole dataset

Languages

This panel displays a pie chart and a table showing the proportion of messages in each language. Languages are identified using an automated language classifier. Clicking on a language in the table adds it to the dashboard filter. Messages can be classified as Unknown for various reasons, and these cases are reported separately. “Unknown - Media Links only” means that the message contains just a URL, so there is effectively no language to identify. “Unknown - Mentions only” means that the message contains only a mention of someone’s username, so again there is effectively no language to identify. Finally, if the classifier cannot identify the language for any other reason, the message is simply classified as Unknown.

Language breakdown across the whole dataset
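
The Unknown categories described above can be illustrated with a minimal sketch. This is not the dashboard's actual implementation: the function name `unknown_category`, the regular expressions, and the handling of messages that mix URLs and mentions are all assumptions made for illustration.

```python
import re
from typing import Optional

# Assumed patterns for illustration only; the dashboard's real rules may differ.
URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")


def unknown_category(text: str) -> Optional[str]:
    """Return an Unknown label when a message has no text to classify, else None."""
    has_url = bool(URL_RE.search(text))
    has_mention = bool(MENTION_RE.search(text))

    # Strip URLs and mentions, then check whether any other text remains.
    remainder = MENTION_RE.sub("", URL_RE.sub("", text)).strip()
    if remainder:
        return None  # Real text is present; a language classifier would handle it.

    if has_url and not has_mention:
        return "Unknown - Media Links only"
    if has_mention and not has_url:
        return "Unknown - Mentions only"
    # Assumption: anything else without text (empty, or URL plus mention) is plain Unknown.
    return "Unknown"
```

For example, `unknown_category("@someone")` yields the “Unknown - Mentions only” label, while a message with ordinary text returns `None` and would be passed to the language classifier.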

Posts

The final panel on this tab shows all the posts, sorted by date. An option allows the sort order to be switched between ascending and descending. By default, both original messages and replies are displayed; checkboxes allow either of these to be deselected.
