Thread by @K_G_Andersen, The SARS-CoV-2 furin cleavage site is yet again in the news

The SARS-CoV-2 furin cleavage site is yet again in the news - this time because of a quote by Nobel laureate David Baltimore.

The site is not a "smoking gun", nor does it "make a powerful challenge to the idea of a natural origin".

Quite the opposite, so a little science 🧵👇


The furin cleavage site (FCS) / polybasic cleavage site is present in SARS-CoV-2 at the S1/S2 junction of the spike protein where it mediates the cutting (by the host protease furin, among others) of the spike, which is required for infection of cells.

The FCS was created by an out-of-frame insertion of "CTCCTCGGCGGG", creating the "(P)RRAR" amino acid sequence. This constitutes a suboptimal polybasic cleavage site that is important for expanding SARS-CoV-2 host range, its transmission and pathogenesis, etc.
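To make the "out-of-frame" point concrete, here is a minimal Python sketch that translates the 12-nt insert on its own and then in its genomic reading frame. The flanking nucleotides (`T` before, `CACGT` after) are an assumption based on the commonly cited reference sequence around the S1/S2 junction, and `translate` is a hypothetical helper written for this illustration.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(nt: str) -> str:
    """Translate a nucleotide string in frame 0, ignoring a trailing partial codon."""
    usable = len(nt) - len(nt) % 3
    return "".join(CODON_TABLE[nt[i:i + 3]] for i in range(0, usable, 3))


INSERT = "CTCCTCGGCGGG"

# Read on its own (frame 0), the insert does not encode the polybasic motif.
alone = translate(INSERT)

# Assumed genomic context: the insert sits one nucleotide into a codon,
# so the flanking bases shift the frame and complete the motif.
in_context = translate("T" + INSERT + "CACGT")

print(alone, in_context)
```

Under these assumptions, the insert only yields the P-R-R-A-R residues when read in the shifted frame set by its surroundings, which is what "out-of-frame" refers to.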

References for:

Possible host range expansion: https://jvi.asm.org/content/94/5/e01774-19

Transmission: https://www.nature.com/articles/s41564-021-00908-w

Pathogenesis: https://www.biorxiv.org/content/10.1101/2020.08.26.268854v1

Furin Cleavage Site Is Key to SARS-CoV-2 Pathogenesis

SARS-CoV-2 has resulted in a global pandemic and shutdown economies around the world. Sequence analysis indicates that the novel coronavirus (CoV) has an insertion of a furin cleavage site (PRRAR) in...

https://jvi.asm.org/content/94/5/e01774-19

FCSs are abundant, including being highly prevalent in coronaviruses. While SARS-CoV-2 is the first example of a SARSr virus with an FCS, other betacoronaviruses (the genus for SARS-CoV-2) have FCSs, including MERS and HKU1. https://www.sciencedirect.com/science/article/pii/S1873506120304165?via%3Dihub

Furin cleavage sites naturally occur in coronaviruses

The spike protein is a focused target of COVID-19, a pandemic caused by SARS-CoV-2. A 12-nt insertion at S1/S2 in the spike coding sequence yields a f…

https://www.sciencedirect.com/science/article/pii/S1873506120304165?via%3Dihub

There is nothing mysterious about having a "first example" of a virus with an FCS. Viruses sampled to date only give us a teeny-tiny fraction of all the viruses circulating in the wild. Fragments - such as the CTCCTCGGCGGG - come and go all the time. https://www.biorxiv.org/content/10.1101/2021.02.03.429646v1

Extensive recombination-driven coronavirus diversification expands the pool of potential pandemic...

The ongoing SARS-CoV-2 pandemic is the third zoonotic coronavirus identified in the last twenty years. Previously, four other known coronaviruses moved from animal reservoirs into humans and now...

https://www.biorxiv.org/content/10.1101/2021.02.03.429646v1

How did SARS-CoV-2 acquire the FCS? We don't know; however, we know that four main mechanisms often lead to insertions:

(1) mutation

(2) polymerase slippage

(3) template switching

(4) recombination

All of which play key roles in coronavirus (incl. SARS-CoV-2) evolution.

While we don't know for sure how SARS-CoV-2 acquired the FCS, template switching is a very likely explanation with a plausible mechanism: https://link.springer.com/article/10.1007%2Fs00705-020-04750-z

We also find insertions - albeit not FCSs (yet) - in highly related viruses, e.g., RmYN02: https://www.cell.com/current-biology/fulltext/S0960-9822(20)30662-X

A Novel Bat Coronavirus Closely Related to SARS-CoV-2 Contains Natural Insertions at the S1/S2...

Zhou et al. report a bat-derived coronavirus, RmYN02, which is the closest relative of SARS-CoV-2 in most of the virus genome reported to date. RmYN02 contains an insertion at the S1/S2 cleavage site...

https://link.springer.com/article/10.1007%2Fs00705-020-04750-z

Template switching likely also plays an important role during the ongoing evolution of SARS-CoV-2: https://www.biorxiv.org/content/10.1101/2021.04.23.441209v1

We need to see this in the context of the decades of evolution of the SARS-CoV-2 ancestor and related viruses in bats. It's safe to say indels come and go.

Insertions in SARS-CoV-2 genome caused by template switch and duplications give rise to new...

The appearance of multiple new SARS-CoV-2 variants during the winter of 2020-2021 is a matter of grave concern. Some of these new variants, such as B.1.351 and B.1.1.17, manifest higher infectivity...

https://www.biorxiv.org/content/10.1101/2021.04.23.441209v1

The FCS itself, (P)RRAR, is not an optimal site (for cleavage) and has never previously been used in CoV experiments to the best of my knowledge - unlike more optimal sites, which have been inserted into SARSr CoVs for basic research: https://www.sciencedirect.com/science/article/pii/S0042682206000900

Furin cleavage of the SARS coronavirus spike glycoprotein enhances cell–cell fusion but does not...

The fusogenic potential of Class I viral envelope glycoproteins is activated by proteloytic cleavage of the precursor glycoprotein to generate the mat…

https://www.sciencedirect.com/science/article/pii/S0042682206000900

🚨 The exact same (P)RRAR FCS found in SARS-CoV-2 can be found in different viruses, including Feline coronavirus (FCoV), which is an alphacoronavirus.

Note, site not present in all closely related viruses and plenty of indels around the site - like SARS-CoV-2 vs SARSr CoVs.


If we zoom in on the (P)RRAR site in SARS-CoV-2 and compare it to the one found in (some) FCoV sequences, we can see there's a fair bit of homology outside the FCS too - including likely O-linked glycans being conserved.

The (P)RRAR FCS isn't optimal, and while it's 'sufficient' for SARS-CoV-2's 'success' as a pandemic virus, it's not an ideal site as defined by the canonical R-X-K/R-R FCS seen in many proteins (viral and otherwise). https://onlinelibrary.wiley.com/doi/full/10.1002/cti2.1073
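As a rough illustration of why (P)RRAR counts as suboptimal, the sketch below checks a four-residue site against the canonical R-X-K/R-R pattern named above. The looser R-X-X-R pattern is an assumption on my part (it is often described as the minimal furin motif), and `classify_site` is a hypothetical helper, not an established tool. Pattern matching like this says nothing about actual cleavage efficiency.

```python
import re

# Canonical furin motif from the text: R-X-K/R-R (X = any residue).
CANONICAL = re.compile(r"R.[KR]R")
# Assumption: the looser R-X-X-R pattern, often described as the minimal motif.
MINIMAL = re.compile(r"R..R")


def classify_site(residues: str) -> str:
    """Classify the four residues immediately upstream of the cleavage position."""
    if CANONICAL.fullmatch(residues):
        return "canonical"
    if MINIMAL.fullmatch(residues):
        return "minimal only"
    return "no motif"


# RRAR has alanine where the canonical motif expects K or R.
for site in ["RRAR", "RRKR", "AAAA"]:
    print(site, classify_site(site))
```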

Furin‐mediated protein processing in infectious diseases and cancer

The serine protease furin regulates numerous processes in health and disease and has become a promising target for the treatment of viral and bacterial infections, as well as cancer. This review...

https://onlinelibrary.wiley.com/doi/full/10.1002/cti2.1073

The "P" from the (P)RRAR insert isn't directly part of the cleavage site itself, but, intriguingly, may regulate it via the nearby O-linked glycans.

This is seen in host proteins: https://www.jbc.org/article/S0021-9258(20)32890-8/fulltext

but also in SARS-CoV-2: https://www.biorxiv.org/content/10.1101/2021.02.05.429982v1

Furin cleavage of the SARS-CoV-2 spike is modulated by O-glycosylation

The SARS-CoV-2 coronavirus responsible for the global pandemic contains a unique furin cleavage site in the spike protein (S) that increases viral infectivity and syncytia formation. Here, we show...

https://www.jbc.org/article/S0021-9258(20)32890-8/fulltext

Importantly, however, in recent months we have started seeing the "P" mutating towards residues creating more optimal furin sites - P681H and, especially, P681R, which can be found in B.1.1.7 and B.1.617.x, suggesting the virus may evolve towards more efficient usage of the site.

🚨 So Baltimore's first point - that the FCS found in SARS-CoV-2 is somehow unusual - is simply incorrect. FCSs are found in a multitude of different coronaviruses, indels come and go frequently, and the exact (P)RRAR can be found in other coronaviruses.

Now, the codons. Here, Baltimore is talking about the two codons coding for the first two arginines (R) following the P - CGG. The CGG codon is rare in viruses because it's an example of an unmethylated "CpG" site that can be bound by TLR9, leading to immune cell activation.

🚨 Despite being rare, however, CGG codons *are* found in all coronaviruses, albeit at low frequency. Specifically, of all arginine codons, CGG is used at these frequencies in these viruses:

SARS: 5%
SARS2: 3%
SARSr: 2%
ccCoVs: 4%
HKU9: 7%
FCoV: 2%

Nothing unusual here.
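For anyone who wants to reproduce this kind of number on their own sequences, here is a minimal sketch of the calculation: the share of CGG among all arginine codons in an in-frame coding sequence. The function name and the toy sequence are made up for illustration; the percentages above were not produced by this code.

```python
from collections import Counter

# The six codons that encode arginine in the standard genetic code.
ARG_CODONS = {"CGT", "CGC", "CGA", "CGG", "AGA", "AGG"}


def cgg_fraction(cds: str) -> float:
    """Return the fraction of arginine codons that are CGG in an in-frame CDS."""
    usable = len(cds) - len(cds) % 3
    codons = (cds[i:i + 3] for i in range(0, usable, 3))
    counts = Counter(c for c in codons if c in ARG_CODONS)
    total = sum(counts.values())
    return counts["CGG"] / total if total else 0.0


# Toy sequence (hypothetical): ATG CGG AGA CGT AGG TAA
toy = "ATGCGGAGACGTAGGTAA"
print(f"{cgg_fraction(toy):.0%} of arginine codons are CGG")
```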

🚨 Furthermore, if we go back to the FCoV sequences and compare them to SARS-CoV-2 at the nucleotide level you'll see that FCoV also uses CGG to code for R immediately following the P. The next R is CGA (non-CpG) in FCoV, while it's CGG in SARS-CoV-2 - one nucleotide difference.


We see CGG multiple times in different ways - here's an example comparing another "PR" stretch between SARS-CoV-2, RaTG13, and SARS-CoV in the N gene. Note how SARS-CoV-2 and RaTG13 both use CGG, while SARS-CoV uses CGC for the first R, while later R's are coded by CGT or AGA.

One final point about the CGG codons in the FCS - if they were somehow "unnatural", we'd see SARS-CoV-2 evolve away from "CGG" during the ongoing pandemic. We have more than a million genomes to analyze, so what do we find if we look at synonymous mutations at the "CGG_CGG" site?

🚨 Remarkably stable. Specifically, CGG is 99.87% conserved in the first codon and 99.84% conserved in the second.

This is *very* strong evidence that SARS-CoV-2 'prefers' CGG in these positions.

R is coded by six different codons, yet the simple single transition "CGA" is only observed in ~0.02% of sequences. The second most 'popular' codon at these sites is "CGT" (a transversion) at 0.11% frequency.
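The transition/transversion labels can be checked with a small sketch like the one below, which compares each alternative arginine codon to CGG. `describe_change` is a hypothetical helper, and the code only classifies the nucleotide change; it says nothing about how often each change is observed.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def describe_change(ref: str, alt: str) -> str:
    """Classify a codon change as a transition, a transversion, or a multi-site change."""
    diffs = [(r, a) for r, a in zip(ref, alt) if r != a]
    if len(diffs) != 1:
        return f"{len(diffs)} differences"
    r, a = diffs[0]
    # A transition swaps purine for purine or pyrimidine for pyrimidine.
    same_class = {r, a} <= PURINES or {r, a} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


# The five synonymous alternatives to CGG for arginine.
for alt in ["CGA", "CGT", "CGC", "AGG", "AGA"]:
    print(alt, describe_change("CGG", alt))
```

CGA is the only synonymous codon reachable from CGG by a single transition, which is why its rarity in the sequence data is notable.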

In other words - there is nothing unusual about the codons either.

So Baltimore's second point is also false, invalidating his hypothesis that the "FCS [...] with its arginine codons [...] was the smoking gun for the origin of the virus".

Baltimore does not provide any evidence to support his hypothesis and the data support a natural origin.

Does this disprove a lab leak? No. However, it disproves there being a "smoking gun" in the FCS and lends further evidence to natural emergence - but it also does not *prove* that scenario.

To this day, we have yet to see any scientific evidence supporting a lab leak.

A couple of other *key* references I did not get a chance to discuss:

https://virological.org/t/the-sarbecovirus-origin-of-sars-cov-2-s-furin-cleavage-site/536

https://virological.org/t/naturally-occurring-indels-in-multiple-coronavirus-spikes/560

https://virological.org/t/spike-protein-sequences-of-cambodian-thai-and-japanese-bat-sarbecoviruses-provide-insights-into-the-natural-evolution-of-the-receptor-binding-domain-and-s1-s2-cleavage-site/622

https://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.3001115

What others did I miss?

Natural selection in the evolution of SARS-CoV-2 in bats created a generalist virus and highly...

A study of the natural origins of SARS-CoV-2 reveals very little adaptive evolution occurring since it emerged in humans, but strong evolutionary signals in the bat virus lineage from which SARS-Co...

https://virological.org/t/the-sarbecovirus-origin-of-sars-cov-2-s-furin-cleavage-site/536

Variants of 👇 have come up - it's false. Specifically:

1. The events are not independent, hence the calculation is incorrect.

2. It's the same argument used by creationists about "irreducible complexity" - also false:

https://en.wikipedia.org/wiki/Irreducible_complexity

https://www.americanprogress.org/issues/religion/news/2006/04/10/1934/the-flaws-in-intelligent-design/


As to Richard's final point - well... #introspection
