We focus on mutations / deletions that arose repeatedly, as those are likely to increase viral transmission. Some arose independently hundreds of times. Below (panel A), four mutations / deletions associated with 'Variants of Concern' (VoCs). The B.1.1.7 clade is shown at 3-8 o'clock in the trees.
Each time a recurrent mutation appears in the #SARSCoV2 tree, we count the descendants from that node with and without the mutation and calculate the ratio of offspring of either type. We then normalise those ratios over viral generation times and average over replicates.
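The counting step can be sketched roughly as follows. This is a minimal illustration and not the actual pipeline: the `Node` class, `count_tips` and `offspring_ratio` are hypothetical names, the tree is a toy structure, and the normalisation over generation times and the averaging over replicates are left out.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Node:
    """Toy tree node (hypothetical structure, for illustration only)."""
    has_mutation: bool = False
    children: List["Node"] = field(default_factory=list)


def count_tips(node: Node) -> Tuple[int, int]:
    """Return (tips carrying the mutation, tips without it) below `node`."""
    if not node.children:
        return (1, 0) if node.has_mutation else (0, 1)
    with_mut, without_mut = 0, 0
    for child in node.children:
        w, wo = count_tips(child)
        with_mut += w
        without_mut += wo
    return with_mut, without_mut


def offspring_ratio(node: Node) -> Optional[float]:
    """Ratio of descendants with vs. without the mutation.

    Returns None when no descendant lacks the mutation, since the
    ratio is undefined in that case (an assumption of this sketch).
    """
    with_mut, without_mut = count_tips(node)
    if without_mut == 0:
        return None
    return with_mut / without_mut


# Toy example: one child clade carries the mutation (3 tips),
# its sister clade does not (2 tips).
mutated = Node(True, [Node(True), Node(True), Node(True)])
wild_type = Node(False, [Node(False), Node(False)])
root = Node(False, [mutated, wild_type])
ratio = offspring_ratio(root)
```

A ratio above 1 at a given node means the lineage carrying the mutation left more descendants than its counterpart without it.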
We can estimate an association with #SARSCoV2 transmissibility for 625 mutations. There is a slight tendency for mutations to be associated with reduced transmissibility overall, in particular C->T ones that are often induced by the host immune system.
We find mutations and deletions positively associated with transmissibility throughout the #SARSCoV2 genome. Besides strong hits in the S protein, we get positive associations in NSP3, NSP6, Orf8, the nucleocapsid gene (N), and elsewhere in the genome ...
We next estimated the transmissibility of #SARSCoV2 strains based on the mutations / deletions they carry, under the assumption that their effect is independent of each other (i.e. disregarding possible 'non-linear' effects of mutations).
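As a rough illustration of the independence assumption: if each mutation / deletion has its own effect score, a strain's score is simply the combination of the scores of what it carries, with no interaction terms. The sketch below assumes the effects add up, which is an assumption of this example and not something stated in the thread. The function name, mutation labels and numbers are all hypothetical placeholders.

```python
from typing import Dict, Iterable


def strain_score(carried: Iterable[str], effects: Dict[str, float]) -> float:
    """Sum per-mutation effect scores for one strain.

    Assumes additive, independent effects (no epistasis). Mutations
    without an estimated effect contribute 0.0.
    """
    return sum(effects.get(m, 0.0) for m in carried)


# Placeholder effect scores, not real estimates.
effects = {"mut_a": 0.5, "mut_b": -0.25}

# A strain carrying two scored mutations and one unscored mutation.
score = strain_score(["mut_a", "mut_b", "mut_c"], effects)
```

Any non-linear interaction between mutations would need extra terms that this sketch deliberately leaves out.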
We observe two step changes in the estimated transmissibility of #SARSCoV2 over time. It first increased with the emergence of the D614G haplotype in early 2020, and then again with the emergence of the 501Y VoCs (B.1.1.7) in the second half of 2020.
Among all major #SARSCoV2 clades, we estimate B.1.1.7 to be most transmissible. We observe a subtle, but highly consistent trend for transmissibility of #SARSCoV2 lineages to decay over time, likely due to the accumulation of deleterious mutations.
At this stage, our #SARSCoV2 strain transmissibility estimates should be considered as relative rather than absolute. We also didn't present any result about non-linear effects (epistasis) between mutations / deletions. More work is needed for us to share those with the world.
To pre-empt any question about B.1.617.2, which was designated as a VoC by PHE yesterday: there is no B.1.617.2 in our dataset, but at this stage we get a fairly unremarkable preliminary transmissibility estimate for B.1.617 of ~0.33.
This analysis was enabled by the GISAID EpiCoV database ( https://www.gisaid.org ). We gratefully acknowledge all contributing / submitting labs around the globe including COG-UK ( https://www.cogconsortium.uk ) who have openly shared large numbers of UK SARS-CoV-2 assemblies.
You can follow @BallouxFrancois.