I was hesitant to bring attention to this, but I suspect it will be picked up soon. I am concerned that a recently posted preprint will cause another panic about a deadly mutant, a panic that is entirely unsupported by the data.
The preprint has identified a spike mutation V1176F and asserts that it is associated (along with another mutation in nsp7, L70F, closely linked to V1176F) with a higher mortality rate: https://doi.org/10.1101/2020.11.17.386714
This preprint uses "patient status" (i.e., alive or dead) at time of sampling, as recorded in GISAID, to look for mutations that have a statistical association with patients being recorded as dead.
There are 168 genomes with V1176F and patient status in GISAID and 160 of them are from Brazil (these are lineage B.1.1.28, or clade 1 in https://science.sciencemag.org/content/369/6508/1255.full, a paper that notes the V1176F mutation). Of these 160 cases, 131 are listed as dead, a case fatality rate (CFR) of about 82%.
The preprint does a Fisher's exact test and gets a vanishingly low probability that across all of Brazil this mutation could have been associated with such a high number of deaths by chance.
Of the 160 cases, 147 were sequenced by Instituto Adolfo Lutz (IAL) and all 131 dead cases are in this set (a CFR of 89%). IAL also sequenced 62 cases that weren't in lineage B.1.1.28 (primarily lineage B.1.1.33, which doesn't have V1176F) and 47 of these patients are also dead (76%).
Of the 168 cases with V1176F and with patient status, if you exclude the ones from IAL there are 20 remaining and all of them are recorded as alive. This suggests that IAL's study has an extreme bias towards dead people (either a convenience sample or a deliberate study design).
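To make the arithmetic concrete, here is a minimal sketch that recomputes the fatality fraction separately for each group, using only the counts quoted in this thread. The group labels and the `cfr` helper are my own illustrative names, not anything from the preprint or from GISAID.

```python
# Counts as quoted in the thread: (recorded dead, total with patient status).
groups = {
    "IAL, V1176F": (131, 147),
    "IAL, other lineages": (47, 62),
    "non-IAL, V1176F": (0, 20),
}


def cfr(dead, total):
    """Fraction of sampled patients recorded as dead."""
    return dead / total


cfr_by_group = {name: cfr(dead, total) for name, (dead, total) in groups.items()}

for name, value in cfr_by_group.items():
    print(f"{name}: {value:.0%}")
```

Split this way, the fatality fraction tracks who did the sequencing far more than it tracks whether the genome carries V1176F.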
So all of the signal for V1176F being a deadly mutation comes from genomes sequenced by IAL. But the genomes sequenced by IAL are not independent and should not be treated as such. This was a lineage or cluster that was circulating in São Paulo and ended up in IAL's study.
If you remove all the IAL sequences, the remaining 20 with V1176F are all listed as alive. The entire signal comes from a study that was sequencing fatal cases, combined with the completely erroneous assumption that you can treat virus genomes as independent data points.
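A toy illustration of how pooling across studies can manufacture an association. The counts below are entirely hypothetical (they are not from GISAID or the preprint) and are chosen so that the mutation has no effect on mortality inside either study. The lab names and the `odds_ratio` helper are also just illustrative.

```python
def odds_ratio(mut_dead, mut_alive, other_dead, other_alive):
    """Odds of death with the mutation divided by odds of death without it."""
    return (mut_dead * other_alive) / (mut_alive * other_dead)


# HYPOTHETICAL counts: (mutant dead, mutant alive, other dead, other alive).
# Within each study the fatality fraction is the same with and without the
# mutation (90% in the first study, 5% in the second).
labs = {
    "fatal_case_study": (90, 10, 18, 2),
    "community_survey": (1, 19, 20, 380),
}

# Odds ratio inside each study on its own.
per_lab = {name: odds_ratio(*counts) for name, counts in labs.items()}

# Odds ratio after naively adding the two studies together.
pooled_counts = tuple(sum(column) for column in zip(*labs.values()))
pooled = odds_ratio(*pooled_counts)

print(per_lab)
print(pooled_counts, pooled)
```

Inside each hypothetical study the odds ratio is exactly 1, yet the pooled table gives a large one, simply because the mutant cluster happens to sit mostly in the study that sampled fatal cases. This only sketches the confounding by study. The fact that genomes from one transmission cluster are not independent draws is a separate problem on top of it.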
Using patient status at time of sampling is an inherently flawed approach that is likely to produce spurious associations, because only a few studies record it (and for these, fatality may be part of the study design). And any particular study is more likely to contain genetically similar viruses.
I should add that this preprint was originally posted here https://www.researchsquare.com/article/rs-95183/v1 about 2 weeks ago. I raised this issue with the authors then, but it would seem they were comfortable enough with the results to post it to bioRxiv, and hence this thread.
Oh, and this preprint discovers the same mutation and the same cluster of Brazilian dead people: https://www.medrxiv.org/content/10.1101/2020.10.23.20218511v1