Can anyone point me to a published paper or any of the research mentioned in this article? There's no link here, no link in an MIT Tech Review article, nothing in several other articles I've seen. @CMU_CASOS can you help me out here? I want to review your methodology. https://twitter.com/NPR/status/1263400594338918401
The thing I normally do when I see an article like this is click through to the academic research being cited but....... I cannot find the academic research being cited
It seems like all the news articles are based on this press release, which itself does not link to any published or unpublished research: https://www.scs.cmu.edu/news/nearly-half-twitter-accounts-discussing-%E2%80%98reopening-america%E2%80%99-may-be-bots
When I go to the publications page of the lab, I can't find anything from 2020 about bots. http://www.casos.cs.cmu.edu/publications/index.php
I suppose that for now I have to assume the lab is using the same methodology outlined in this 2019 paper looking at the role of bots in activist hashtags in the Asia-Pacific region (PDF): http://www.casos.cs.cmu.edu/events/summer_institute/2019/si_portal/pubs/Uyheng%20-%20Characterizing%20Bot%20Networks.pdf
Unfortunately this presents its bot detection methodology as a handwavey machine learning black box, based on a training data set that itself isn't auditable, and with a 60% probability-you-are-a-bot score as their cutoff for comfortably declaring an account a bot
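To make the cutoff point concrete, here is a minimal sketch of how the share of accounts flagged as bots moves with the threshold. This is my own illustration, not the lab's code: the function name, the scores, and the choice of "greater than or equal to" for the comparison are all assumptions, and the numbers are invented.

```python
def flagged_share(scores, threshold):
    """Return the fraction of accounts whose bot probability meets the threshold."""
    if not scores:
        raise ValueError("scores must be non-empty")
    flagged = sum(1 for s in scores if s >= threshold)
    return flagged / len(scores)


# Hypothetical classifier outputs (probability-you-are-a-bot) for ten accounts.
# These values are made up for illustration; they are not from any published data.
HYPOTHETICAL_SCORES = [0.05, 0.20, 0.35, 0.55, 0.61, 0.64, 0.68, 0.72, 0.90, 0.97]

if __name__ == "__main__":
    # Same scores, three different cutoffs, three different headline numbers.
    for cutoff in (0.6, 0.7, 0.9):
        share = flagged_share(HYPOTHETICAL_SCORES, cutoff)
        print(f"cutoff {cutoff:.1f}: {share:.0%} of accounts flagged as bots")
```

With these made-up scores, the flagged share falls from 60% at a 0.6 cutoff to 20% at a 0.9 cutoff. That is why the threshold, and the distribution of scores around it, need to be auditable.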
I found a couple of posters that share the same Office of Naval Research funding award numbers as the Asia-Pacific bot paper that detail some machine learning approaches to social network analysis, possibly related:
http://www.casos.cs.cmu.edu/events/summer_institute/2019/si_portal/posters/poster-Binxuan%201.pdf
Ah here we go, this is the paper that describes Bothunter, the algorithm used in the Asia-Pacific paper, which again I am *assuming* is what was used in the research referred to in the NPR article (which, again, has not been published)
http://www.casos.cs.cmu.edu/publications/papers/LB_5.pdf
Oh no. This paper is.... not very good in my opinion. It's 8 pages long, about 3 pages of which is the actual research, and those sections (1, 2, and 3) give hardly any auditable information. What they do lay out is, well, not what I would do if I were running a bot study
So to train their model they need known bot accounts. Instead of attempting to attract bots, or looking at accounts that were suspended for bot activity, they picked a single "known and publicized" bot attack on the Atlantic Council (!). How do they know it was a bot attack?
Well, they based it off of this article by the Atlantic Council: https://medium.com/dfrlab/botspot-the-intimidators-135244bfe46b
This incident was definitely a bot-based attack, but it was a weird DDoS-style harassment attack rather than a "take control of the conversation" style attack. In other words, their training data set (at least for this paper) is based on a very narrow slice of bot activity...
...and probably not the activity people are thinking of when they see a headline like "Nearly Half Of Accounts Tweeting About Coronavirus Are Likely Bots".
But since there is no published paper for that particular press release, I am just guessing based on their prior research!
In conclusion, holy shit, publish or at least preprint your damn research before you do a massively alarmist press release, my fuckin god
Maybe I should send out a press release and see what mainstream news outlets run with it: "Darius Kazemi, noted Twitter bot expert, says to CMU researchers 'nuh-uh, you're definitely wrong', based on research that he has not published yet and almost certainly exists"
NPR -- Researchers: Aurora Borealis Discovered At This Time of Year, At This Time of Day, In This Part of the Country, Localized Entirely Within Your Kitchen