The first is that it tries to root a SARS-CoV-2 tree using the bat virus RaTG13. This is the closest non-human virus but still has > 1100 nucleotide differences to SC2. Note however the branch to the bat is a bit shorter than that for some reason.
Actually that is pretty much all I have to say about the content of the paper. The root is wrong and all the conclusions (including those in the Daily Mail) are simply not supported. And @edwardcholmes there is a reason we don't let you pick the colours.
What is very upsetting about this paper is that there are a lot of people working incredibly hard to generate this virus sequence data to help with the public health response (there are now >5800 genomes from 65 countries).
There are also teams of people working round the clock to analyse the data using state-of-the-art methods and provide analysis and interpretation (e.g., @nextstrain).
What really bothers me is that these authors pull down some data from #GISAID, run it through an easy-to-use software package, make some very inappropriate choices for this virus and publish what they get out.
How did this then get published in a highly prestigious journal like PNAS? Because one of the authors is a member of the Academy and can 'communicate' his own paper. But what about the reviewers? One of them seems to be in Cambridge anthropology just like the authors.
Taking other people's largely unpublished data that was shared to help the COVID response and publishing it would be OK if the findings were in any way important or even correct.
Spawning a separate thread (a 'tree' if you like) with some more detailed, technical criticism. Whilst I don't think the thread will be long, it may take time to appear as I have work to do.
To kick off I took a dataset from about the same time (it is the GISAID data from 2nd April with 156 genomes). I added the RaTG13 bat virus and built a tree (in this case an ML tree using JC69). The red dot is the bat, the branch represents about 1200 mutations.
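For a rough sense of scale, here is a minimal sketch of the JC69 distance correction applied to the bat branch. It assumes a genome length of about 29,903 nucleotides (the approximate SARS-CoV-2 reference length, not a figure from this thread) and treats the ~1200 mutations as observed pairwise differences, which is a simplification of what an ML tree actually estimates. The function and constant names are illustrative only.

```python
import math


def jc69_distance(p: float) -> float:
    """Return the JC69-corrected distance (substitutions per site)
    given the observed proportion p of differing sites."""
    if not 0 <= p < 0.75:
        raise ValueError("JC69 distance is only defined for 0 <= p < 0.75")
    return -0.75 * math.log(1 - 4 * p / 3)


# Assumption: approximate SARS-CoV-2 reference genome length in nucleotides.
GENOME_LENGTH = 29_903
# From the thread: the bat branch represents about 1200 mutations.
N_DIFFS = 1200

p = N_DIFFS / GENOME_LENGTH   # observed proportion of differing sites
d = jc69_distance(p)          # corrected for multiple hits at the same site

print(f"observed proportion of differences: {p:.4f}")
print(f"JC69-corrected distance:            {d:.4f} substitutions/site")
print(f"implied substitutions on the branch: {d * GENOME_LENGTH:.0f}")
```

The corrected distance comes out slightly larger than the raw proportion because JC69 allows for sites that have changed more than once. Either way it is a very long branch compared with the handful of differences separating the human viruses from each other.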
So basically the bat is so far away from the SARS-CoV-2 viruses that its branch could fit in almost anywhere. Although there are lots of differences between bat and human there are very few within the human viruses.
In this case the tree has placed the root of SARS-CoV-2 on a virus from the USA and an identical Chinese virus. But that is essentially random. Next up - Bayesian analysis...
As an aside I am not sure why the equivalent branch in the paper's figure is not also very very long. It only has 15 mutations marked on it whereas it should have over 1000.
You can follow @arambaut.