Temporal evolution of the Twitter activity and event detection
In this section we show the trends, in terms of news and events, in the tweets about contact tracing apps
collected from July 2020 to June 2021.
The idea is to detect events in the relevant Twitter activity and, for each event, to understand
automatically what happened.
Reusing the sentiment orientation (positive/negative) of the tweets, we have extracted the temporal
evolution of both positive and negative events.
Furthermore, we performed a cross check between pages linked in tweets and the GDELT database to
identify relevant news items.
Combining the relevant news and the number of retweets, we ranked the events and discarded the
least interesting ones.
In addition, we identified the top hashtags and the countries involved in each event.
With this information we tried to understand the main events concerning contact tracing apps and the
views of Twitter users.
Figure 1 – Time series of the tweets mentioning the six most important contact tracing apps and
potential event identification.
Figure 1 shows Twitter users' activity relevant to the six most important mobile contact tracing apps.
The red circles denote potential events detected from the output of the z-score function.
The z-score value of a (potential) event detected is calculated according to the following formula:
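Assuming the standard definition of the z-score applied to daily tweet counts (the notation below is an assumption and is not taken from the original material):

```latex
z_t = \frac{x_t - \mu}{\sigma}
```

Here $x_t$ would be the number of tweets on day $t$, and $\mu$ and $\sigma$ the mean and standard deviation of the daily tweet counts over the reference period.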
Only potential events with a z-score value greater than 1 are displayed in Figure 1, and the size of
the circle is proportional to the z-score value.
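The thresholding step described above can be illustrated with a minimal sketch. It assumes the standard z-score (count minus mean, divided by the population standard deviation) computed over the whole series of daily counts; the function names and the toy data are hypothetical and are not taken from the original analysis.

```python
from statistics import mean, pstdev


def zscores(daily_counts):
    """Return the z-score of each daily count relative to the whole series."""
    mu = mean(daily_counts)
    sigma = pstdev(daily_counts)  # population standard deviation (assumption)
    if sigma == 0:
        # A flat series has no outliers; avoid dividing by zero.
        return [0.0 for _ in daily_counts]
    return [(count - mu) / sigma for count in daily_counts]


def detect_events(dates, daily_counts, threshold=1.0):
    """Return (date, count, z) for days with z > threshold, highest z first."""
    scored = zip(dates, daily_counts, zscores(daily_counts))
    events = [(d, c, z) for d, c, z in scored if z > threshold]
    return sorted(events, key=lambda event: event[2], reverse=True)


# Hypothetical toy series, for illustration only.
dates = ["day-1", "day-2", "day-3", "day-4", "day-5"]
counts = [10, 12, 90, 11, 30]
events = detect_events(dates, counts)
```

With this toy series only the third day exceeds the threshold of 1. In a plot such as Figure 1, the returned z value could be used directly to scale the circle size.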
Table 1 - Events detected using the z-score function.
Table 1 lists the detected events, sorted by z-score.
For each event, the table also reports the number of tweets and the three main contact tracing mobile
apps.
Event detection based on tweets with opinion
In this section we focus on positive and negative events, taking advantage of the identification of the
opinions (positive/negative) expressed in the relevant tweets, which was performed in the Sentiment
analysis section.
In this way, we extract from the tweet time series of Figure 1 only the Twitter activity that involves
positive tweets, in order to identify potential positive events.
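The extraction of a sentiment-specific time series can be sketched as follows. This is an illustration only: the record layout (a `day` and a `sentiment` field per tweet) and the function name are assumptions, not the data model actually used in the analysis.

```python
from collections import Counter
from datetime import date, timedelta


def daily_series(tweets, sentiment):
    """Return [(day, count), ...] for one sentiment, with empty days set to 0."""
    counts = Counter(t["day"] for t in tweets if t["sentiment"] == sentiment)
    if not counts:
        return []
    start, end = min(counts), max(counts)
    n_days = (end - start).days + 1
    days = [start + timedelta(days=i) for i in range(n_days)]
    # Days without matching tweets must appear as zeros, otherwise the
    # mean and standard deviation of the series would be overestimated.
    return [(day, counts.get(day, 0)) for day in days]


# Hypothetical toy records, for illustration only.
tweets = [
    {"day": date(2020, 9, 24), "sentiment": "positive"},
    {"day": date(2020, 9, 24), "sentiment": "positive"},
    {"day": date(2020, 9, 24), "sentiment": "negative"},
    {"day": date(2020, 9, 26), "sentiment": "positive"},
]
positive_series = daily_series(tweets, "positive")
```

The same function applied with `"negative"` would give the input series for the negative events.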
Figure 2 - Time series of the positive tweets mentioning the six most important contact tracing apps
and potential event identification.
Figure 2 shows the temporal evolution of positive tweets, and we can observe that the events with the
highest z-score value took place during the second wave of the pandemic (autumn 2020).
Furthermore, the event on 24 September 2020 has the highest z-score value (9.6), meaning that the number
of tweets on that day was far above the average daily level.
The remaining events displayed in Table 2 have z-score values between 2.2 and 5.7.
Table 2 - Events detected using the z-score function on tweets with positive opinions mentioning the
six most important contact tracing apps.
Figure 3 - Time series of the negative tweets mentioning the six most important contact tracing apps
and potential event identification.
As shown in Figure 3, the landscape of the potential negative events is somewhat different from that
of the positive ones.
The z-score of the top 10 events decreases almost linearly from about 5 to 3 (see Table 3 for details), and
the events are not confined to the second wave.
Table 3 - Events detected using the z-score function on tweets with negative opinions mentioning the
six most important contact tracing apps.
Enhancing event detection with news and retweets
In the previous section we identified potential events based on the z-score and the number of tweets.
Table 4 builds on this analysis: it lists the potential events with a z-score greater than 4.5,
together with statistics on the news items linked in the relevant tweets and the total number of retweets.
This additional information helps us judge the importance of each event more clearly, since the total
number of retweets indicates how widely the news was shared.
Table 4 - Interactive table that allows ordering the potentially important events detected by number of
relevant news items and number of retweets.
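The filtering and ordering behind such a table can be sketched as follows. The exact ranking rule is not stated in the text, so this example assumes a simple one: keep events with a z-score above 4.5, then sort by number of news items and break ties by number of retweets. The `Event` type, its field names and all values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Event:
    label: str        # placeholder identifier, e.g. a date plus polarity
    z: float          # z-score of the event
    news_items: int   # linked pages that matched a news item
    retweets: int     # total retweets of the tweets in the event


def rank_events(events, min_z=4.5):
    """Keep events with z > min_z, ordered by news items, then retweets."""
    kept = [e for e in events if e.z > min_z]
    return sorted(kept, key=lambda e: (e.news_items, e.retweets), reverse=True)


# Hypothetical values, for illustration only.
candidates = [
    Event("A", z=5.0, news_items=2, retweets=100),
    Event("B", z=6.0, news_items=5, retweets=50),
    Event("C", z=3.0, news_items=9, retweets=900),  # below the z threshold
    Event("D", z=4.8, news_items=5, retweets=80),
]
ranked = rank_events(candidates)
```

Swapping the order of the two fields in the sort key gives the alternative ordering by retweets first, which corresponds to the other sort option offered by the interactive table.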
Positive event on 24 September 2020 - Launch of the NHS COVID-19 app
The first potential event, ranked by the number of initial tweets with links to news items (see Table 4), was a positive event on 24 September 2020.
Figure 5 - Density of the geolocated tweets (geo-tweets) and word cloud of the top hashtags.
To get a better understanding of this event, in Figure 5 we present a geographical visualization (map) combined with a hashtag word cloud.
The map shows where the tweets were posted from; darker shading indicates a higher tweet density.
Most of the tweets were clearly posted from the UK. From the hashtag word cloud,
in which hashtags are sized by their relative frequency, we understand that the tweets were related to the NHS COVID-19 contact tracing mobile app.
In particular, the hashtag haveyoudownloaded in the word cloud refers to the launch of the NHS COVID-19 app.
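The relative hashtag frequencies that drive a word cloud of this kind can be computed as in the sketch below. The regular expression, the lower-casing of hashtags and the function name are assumptions made for illustration, and the sample texts are invented.

```python
import re
from collections import Counter

# Simple hashtag pattern: '#' followed by word characters (assumption).
HASHTAG = re.compile(r"#(\w+)")


def hashtag_relative_frequency(texts):
    """Map each hashtag (lower-cased) to its share of all hashtag occurrences."""
    counts = Counter(
        tag.lower() for text in texts for tag in HASHTAG.findall(text)
    )
    total = sum(counts.values())
    if total == 0:
        return {}
    # most_common() orders the result from most to least frequent.
    return {tag: n / total for tag, n in counts.most_common()}


# Invented sample texts, for illustration only.
texts = [
    "Get it today #HaveYouDownloaded #NHSCovid19",
    "#haveyoudownloaded yet?",
    "a tweet without hashtags",
]
freq = hashtag_relative_frequency(texts)
```

In this toy input, haveyoudownloaded accounts for two of the three hashtag occurrences, so it would be drawn largest in the word cloud.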
Table 5 presents the pages linked in tweets that correspond to news items in the GDELT database, ordered by number of retweets.
Table 5 - Pages linked in tweets that match GDELT news items, sorted by number of retweets. For this date there were 16 EMM news items.
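The cross check between linked pages and a news database can be sketched as a URL match followed by a retweet total per page. The text does not describe how the matching was actually done, so this sketch assumes a plain comparison of normalised URLs against a pre-exported set of news URLs. It does not call any GDELT API, and all names and URLs are hypothetical.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def normalise(url):
    """Reduce a URL to host + path so trivial variants compare equal."""
    parts = urlsplit(url.strip().lower())  # lower-casing paths is a simplification
    host = parts.netloc[4:] if parts.netloc.startswith("www.") else parts.netloc
    # Scheme, query string and fragment are dropped (e.g. tracking parameters).
    return host + parts.path.rstrip("/")


def news_by_retweets(linked_pages, news_urls):
    """Return [(page, total_retweets), ...] for pages found in news_urls."""
    known = {normalise(u) for u in news_urls}
    totals = defaultdict(int)
    for url, retweets in linked_pages:
        key = normalise(url)
        if key in known:
            totals[key] += retweets
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Invented inputs, for illustration only.
linked = [
    ("https://www.example.org/news/app-launch?utm=x", 40),
    ("http://example.org/news/app-launch/", 5),
    ("https://example.com/blog/other", 100),  # not in the news set
]
news = ["https://example.org/news/app-launch"]
matched = news_by_retweets(linked, news)
```

Here the two variants of the same article are merged and the unmatched blog page is dropped, even though it has the most retweets.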
Negative event on 24 September 2020
The second potential event, in terms of number of initial tweets with links to news items (see Table 4), was a negative event on 24 September 2020.
Figure 6 - Density of the geolocated tweets (geo-tweets) and word cloud of the top hashtags.
To get a better understanding of this event, in Figure 6 we present a geographical visualization (map) combined with a hashtag word cloud.
The map shows where the tweets were posted from; darker shading indicates a higher tweet density.
Most of the tweets were clearly posted from the UK and France. From the hashtag word cloud, in which hashtags are sized by their relative frequency, we understand that the tweets were related to politicians (Valp) and to the NHS COVID-19 contact tracing mobile app.
The word cloud also contains some very negative hashtags.
Table 6 presents the pages linked in tweets that correspond to news items in the GDELT database, ordered by number of retweets.
Two types of news are linked in the tweets: the first concerns personal protective equipment and the second the NHS COVID-19 app.
Table 6 - Pages linked in tweets that match GDELT news items, sorted by number of retweets. For this date there were 16 EMM news items.
Negative event on 26 September 2020
The third potential event, in terms of number of initial tweets with links to news items (see Table 4), was a negative event on 26 September 2020.
Figure 7 - Density of the geolocated tweets (geo-tweets) and word cloud of the top hashtags.
To get a better understanding of this event, in Figure 7 we present a geographical visualization (map) combined with a hashtag word cloud.
The map shows where the tweets were posted from; darker shading indicates a higher tweet density.
Most of the tweets were clearly posted from France. From the hashtag word cloud, in which hashtags are sized by their relative frequency, we understand that the tweets were related to politicians (Macron, Castex) and to the NHS COVID-19 contact tracing mobile app.
The word cloud shows that attention focuses on the NHS COVID-19 app and its potential problems.
Furthermore, there appears to be a relation with other mobile contact tracing apps such as Covid Ireland and Radarcovid, most probably because of the way users add tags to their tweets.
Table 7 presents the pages linked in tweets that correspond to news items in the GDELT database, ordered by number of retweets.
Table 7 - Pages linked in tweets that match GDELT news items, sorted by number of retweets. For this date there were 0 EMM news items.
Negative event on 12 October 2020
The fourth potential event, in terms of number of initial tweets with links to news items (see Table 4), was a negative event on 12 October 2020.
Figure 8 - Density of the geolocated tweets (geo-tweets) and word cloud of the top hashtags.
To get a better understanding of this event, in Figure 8 we present a geographical visualization (map) combined with a hashtag word cloud.
The map shows where the tweets were posted from; darker shading indicates a higher tweet density.
Most of the tweets were posted from France, followed by Italy.
From the hashtag word cloud, in which hashtags are sized by their relative frequency, we understand that the tweets were related to politicians in France (Castex) and to the contact tracing mobile app Immuni in Italy.
Table 8 presents the pages linked in tweets that correspond to news items in the GDELT database, ordered by number of retweets.
There is only one such news item, which concerns face masks and their contribution to the increase in litter in Ireland.
Table 8 - Pages linked in tweets that match GDELT news items, sorted by number of retweets. For this date there was 1 EMM news item.