Did you know you can use #AI to analyse big datasets of news articles to determine how the media is covering a certain topic?

A thread on what my research assistant @JoyceOoops and I learned about #Uganda's #Covid19 #media coverage by applying #LDATopicModelling #MachineLearning
During the Covid-19 lockdown, Whitehead Comm built out our research team to learn & test out new methods to see the bigger picture & inform smarter comms strategy.

@JoyceOoops began applying her Python skills & super curiosity to Natural Language Processing & LDA Topic Modelling
We decided to apply this new machine learning media analysis technique to tracking how Covid-19 was covered by Ugandan online news media.

We began by scraping all articles containing the keyword stems "covid" and "corona" from 13 Ugandan news sites, collecting over 13,000 articles.
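A minimal sketch of how a keyword-stem filter like this might look in Python. The two stems come from the thread; the sample articles and the function name `mentions_keyword` are illustrative assumptions, not the team's actual scraper.

```python
import re

# Keyword stems taken from the thread; everything else here is illustrative.
KEYWORD_STEMS = ("covid", "corona")

# Match a stem at the start of a word, so "coronavirus" and "COVID-19" both count.
STEM_PATTERN = re.compile(r"\b(?:" + "|".join(KEYWORD_STEMS) + r")", re.IGNORECASE)


def mentions_keyword(text: str) -> bool:
    """Return True if the text contains a word starting with any keyword stem."""
    return STEM_PATTERN.search(text) is not None


# Hypothetical article records; a real scraper would fill these in.
articles = [
    {"title": "Ministry confirms new COVID-19 cases", "body": "Testing continues."},
    {"title": "Coronavirus: truck drivers tested at border", "body": "Queues grow."},
    {"title": "Premier League results", "body": "Football news."},
]

# Keep only the articles whose title or body mentions a keyword stem.
relevant = [a for a in articles if mentions_keyword(a["title"] + " " + a["body"])]
print(len(relevant))
```

Matching on stems rather than whole words is what lets one pattern catch "corona", "coronavirus" and "Covid-19" alike.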
Our purpose was 2-fold:

1. Test this new technique & learn how we can apply it to our work (this is experimental research);

2. Gather findings to correlate with results from other methods & identify determinants of public opinion.

Our first finding: coverage peaked in April.
(I'm tweeting from @bletchleypark, where I've gone as a kind of pilgrimage today to check out where my grandfather worked as a code breaker. Let me get the most out of this place before it closes and then come back to finish up this thread.)
Back to our Topic Modelling results...

The articles we scraped from Uganda's top online news sites using the keywords "covid" and "corona" suggest that #Covid19UG coverage peaked during the initial lockdown in April 2020, after which the number of articles on the subject declined.
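Finding a coverage peak comes down to counting articles per month. A minimal sketch, assuming each scraped article carries a publication date; the dates below are made up for illustration and are not the real dataset.

```python
from collections import Counter
from datetime import date

# Hypothetical publication dates; real ones would come from the scraped articles.
publication_dates = [
    date(2020, 3, 22), date(2020, 4, 2), date(2020, 4, 15),
    date(2020, 4, 28), date(2020, 5, 9), date(2020, 6, 1),
]

# Count articles per (year, month) bucket.
articles_per_month = Counter((d.year, d.month) for d in publication_dates)

# The peak month is the bucket with the most articles.
peak_month, peak_count = articles_per_month.most_common(1)[0]

# Print the monthly series in chronological order.
for (year, month), count in sorted(articles_per_month.items()):
    print(f"{year}-{month:02d}: {count}")
```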
After some data cleaning, we ran the LDA algorithm many times with different parameters on the thousands of articles we'd collected on #Covid19 from Ugandan online news. The model applies Natural Language Processing (NLP) to identify topics. We settled on 16 topics as the best fit.
It was interesting to see how the LDA Topic Model picked up on uniquely Ugandan Covid-19 news trends.

Ex.

Articles about the domestic outbreak and government response peaked around the time of the 1st declared case;

Reports of cases and testing featured a lot of truck drivers.
More topics our model identified that were so UG:

The courts & justice topic 1st peaked when courts adapted to lockdown, then again when Bad Black threatened to sue @MinofHealthUG;

Electoral politics emerged as a Covid topic in mid-June w/ introduction of "scientific election".
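Spotting when a topic peaks can be sketched by averaging each topic's weight across the articles published in a month. Everything below is hypothetical: the topic labels, the weights and the months are illustrative stand-ins for a fitted model's per-article output, not our results.

```python
from collections import defaultdict

# Hypothetical (month, topic weights) pairs; weights per article sum to 1.
# Topic names are labels a human would assign after reading the top words.
TOPICS = ("cases_and_testing", "courts_and_justice", "electoral_politics")
article_topic_weights = [
    ("2020-04", (0.7, 0.2, 0.1)),
    ("2020-04", (0.5, 0.4, 0.1)),
    ("2020-05", (0.4, 0.5, 0.1)),
    ("2020-06", (0.3, 0.1, 0.6)),
    ("2020-06", (0.2, 0.2, 0.6)),
]

# Group the per-article weights by month.
weights_by_month = defaultdict(list)
for month, weights in article_topic_weights:
    weights_by_month[month].append(weights)

# Average each topic's weight within each month.
monthly_topic_share = {
    month: tuple(sum(column) / len(rows) for column in zip(*rows))
    for month, rows in weights_by_month.items()
}


def peak_month(topic: str) -> str:
    """Return the month in which the topic's average share is highest."""
    index = TOPICS.index(topic)
    return max(monthly_topic_share, key=lambda m: monthly_topic_share[m][index])


print(peak_month("electoral_politics"))
```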
Curious about how we did this? Here's our methodology.👇

Like it says in my Twitter bio, we're learning as we go. LDA Topic Modelling isn't a perfect method for media analysis, but it's pretty cool!

We're open to further collaboration.🤓
Email: [email protected]
The full report on Whitehead Comm's LDA Topic Modelling of Covid-19 Media Coverage in Uganda is available for download on our website.

(Apologies: if you are in Uganda, you may need to use a VPN to access http://whiteheadcommunications.com )

👇
http://www.whiteheadcommunications.com/newsletter/lda-topic-modelling-of-covid-19-media-coverage-in-uganda