This thread is valuable for a *lot* of reasons in its own right. I am going to use it as an example of the importance of sampling and genomic epidemiology https://twitter.com/trvrb/status/1265063921880150016?s=20 1/n
For the uninitiated, genomic epi in infectious disease is sort of like DNA fingerprinting. It measures the relatedness of pathogens, so that when you find the same fingerprint, you can infer the same culprit. Except with genomes the culprit evolves, and picks up mutations 2/n
This means that over time lineages pick up small changes, called Single Nucleotide Polymorphisms or SNPs. A detective wouldn’t get a conviction on the basis that the fingerprint was similar but not the same. But for genomic epi SNPs matter 3/n
They matter because we know that the fewer SNPs, the more likely the two cases are part of the same transmission chain. Right? Well normally yes. But this is not a normal situation in case you had not noticed. All SARS-CoV-2 genomes are almost identical 4/n
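To make the SNP counting concrete, here is a minimal Python sketch. It assumes the genomes are already aligned to the same length. The toy sequences and the `snp_distance` helper are made up for illustration and are not from any real pipeline.

```python
def snp_distance(seq_a: str, seq_b: str) -> int:
    """Count positions where two aligned sequences differ.

    Assumes equal-length, already-aligned sequences. Positions where either
    sequence has an ambiguous base (N) or a gap (-) are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    skip = {"N", "-"}
    return sum(
        1
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a != b and a not in skip and b not in skip
    )


# Toy 12-base "genomes" (hypothetical, for illustration only).
case_1 = "ACGTACGTACGT"
case_2 = "ACGTACGTACGA"  # one change relative to case_1
case_3 = "ACTTACGAACGA"  # three changes relative to case_1

print(snp_distance(case_1, case_2))
print(snp_distance(case_1, case_3))
```

On this logic case_2 looks like a closer relative of case_1 than case_3 does. When nearly every pair of genomes is only a handful of SNPs apart, that ranking tells you much less.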
Normally you can say “nah, no chance these cases are connected” because they’re too different, they have too many mutations separating them. But the most recent common ancestor of this virus, for ALL sampled SARS-CoV-2 genomes to date, was around Thanksgiving last year 5/n
That is not a long time in which to go global. It is also not a long time in which to accumulate the SNPs we need to work with. And to make things worse, we’ve also barely scratched the surface when it comes to which viruses are circulating where 6/n
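A back-of-the-envelope sketch of why there is so little variation to work with. The substitution rate and genome length below are assumed round numbers, not figures from this thread, and the 180 days is just an illustrative stand-in for "about six months".

```python
def expected_snps(rate_per_site_per_year: float, genome_length: int, days: float) -> float:
    """Expected number of substitutions along one lineage over `days`."""
    return rate_per_site_per_year * genome_length * days / 365.0


# ASSUMPTIONS (illustrative round numbers, not from the thread):
ASSUMED_RATE = 8e-4             # substitutions per site per year
ASSUMED_GENOME_LENGTH = 30_000  # bases
ASSUMED_DAYS = 180              # roughly six months since the common ancestor

per_lineage = expected_snps(ASSUMED_RATE, ASSUMED_GENOME_LENGTH, ASSUMED_DAYS)
# Two sampled genomes each descend separately from the common ancestor,
# so the expected distance between them is at most about twice that.
between_two = 2 * per_lineage

print(round(per_lineage, 1), round(between_two, 1))
```

With these assumed inputs, each lineage has had time to pick up only around a dozen changes across roughly 30,000 bases, which is why so many genomes look nearly identical.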
It is clear that the US had infections from both Asia and Europe, and that these were distinct in the SNPs they had picked up en route, so we can tell them apart (I don’t agree with the title of this preprint, but it does demonstrate this) https://www.biorxiv.org/content/10.1101/2020.04.29.069054v1.full.pdf 7/n
Where the WA story gets hard is that there were 2 early genomes that were closely related, and so it was natural to group them together. The difficult thing is that without any knowledge of other possible sources, you will always pin it to the one you know about 8/n
In this case it was the WA outbreak. It is now pretty clear that there were more introductions and it is hard to pick among them. This ain’t surprising. This lays out why it’s even expected. Things are hard when there’s little variation https://www.nature.com/articles/s41564-020-0738-5 9/n
Sampling matters always. For genomic ID epi, when you are trying to infer where the infection came into your community, reflect that if you have not sampled the source you have no way whatsoever of detecting the importation, or what is actually going on in the population 10/end
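A toy Python sketch of the "you pin it to the source you know about" problem. The genomes, the source names and the `closest_source` helper are all hypothetical. Real analyses use phylogenetic trees rather than a simple nearest match, so treat this as an illustration of the sampling issue only.

```python
def snp_distance(seq_a: str, seq_b: str) -> int:
    """Count differing positions between two equal-length aligned sequences."""
    return sum(a != b for a, b in zip(seq_a, seq_b))


def closest_source(case: str, sampled_sources: dict) -> tuple:
    """Return (name, distance) for the sampled source genome nearest to `case`."""
    return min(
        ((name, snp_distance(case, genome)) for name, genome in sampled_sources.items()),
        key=lambda pair: pair[1],
    )


# Hypothetical toy genomes, for illustration only.
local_case = "ACGTACGTACGA"
sampled = {"source_A": "ACGTACGTACGT"}    # the only source we have sequenced
unsampled = {"source_B": "ACGTACGTACGA"}  # the true source, never sequenced

# With only source_A in the data, the case is attributed to source_A.
print(closest_source(local_case, sampled))

# Once source_B has been sampled, the attribution changes.
print(closest_source(local_case, {**sampled, **unsampled}))
```

The first call can only ever return source_A, however good the match is, because nothing else is in the data. Only sampling source_B makes the real importation visible.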