Let's talk about DGRs, i.e. Diversity Generating Retroelements, one of the most elegant molecular mechanisms I've ever heard of (seriously, they're really really really really cool).
In the first instance ever described, a DGR in a phage genome was used to introduce mutations in a gene involved in host recognition, i.e. if the host changes, the phage can quickly adapt (while keeping the rest of its genome intact).
More generally, DGRs are kind of like cheating at the (evolutionary) slot machine: instead of spinning everything at every round, you keep everything in place (e.g. the two 7s you already have), and only spin the one symbol you want to.
It's too complex to do it justice here, but suffice it to say it involves an error-prone reverse transcriptase, some imperfect repeats, a stem-loop, and other things we have not characterized yet.
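For the code-minded, here is a toy sketch of the "only spin one symbol" idea. It is not the real mechanism, just an illustration under loud assumptions: the function name `dgr_diversify`, the `error_rate` parameter and the sequences are all made up, and the "errors only at adenines" rule comes from the published DGR literature rather than from this thread.

```python
import random


def dgr_diversify(genome, vr_start, template, error_rate=0.5, rng=None):
    """Toy model of a DGR (hypothetical helper, not a real tool).

    Copies `template` with random errors at adenines only (assumption taken
    from the DGR literature), then overwrites the variable region of `genome`
    starting at `vr_start`. Everything outside that region is left untouched.
    """
    rng = rng or random.Random()
    copy = []
    for base in template:
        if base == "A" and rng.random() < error_rate:
            # Error-prone copying: an adenine may become any base (or stay A).
            copy.append(rng.choice("ACGT"))
        else:
            # All other positions are copied faithfully.
            copy.append(base)
    vr_end = vr_start + len(template)
    return genome[:vr_start] + "".join(copy) + genome[vr_end:]


# Made-up example: a 10-nt variable region flanked by 4 nt of "rest of genome".
template = "TTACGATAAC"
genome = "GGCC" + template + "CCGG"
variant = dgr_diversify(genome, 4, template, rng=random.Random(0))
print(genome)
print(variant)
```

Run it a few times with different seeds: the flanks never change, and neither do the non-A positions of the target, which is the whole point of the slot machine analogy.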
Collectively, we had (as of 2019) ~1,500 different DGR sequences identified. However, it is still unclear exactly who is using these systems in nature, and for what. Is this mostly viruses? Cells? Is it always used in virus:host interactions like the original DGR?
And if you know JGI, you already guessed the next step... "What if we searched every (meta)genome we have?". So we first figured out how to systematically detect DGRs, then we mined public (meta)genomes, and went from the ~1,500 DGRs previously identified to >30,000 (!)
First, DGRs are pretty much everywhere, but not evenly distributed, i.e. they are far more common in some places and organisms than in others. Human gut phages and bacteria (especially CPR) in aquatic environments are strongly enriched in DGRs (see the pre-print for caveats)
Note that this includes a lot of phages infecting the dominant members of human gut microbiomes, so put another way: there's very likely a phage using a DGR right now in your gut.
Second, DGRs seem to be primarily (and maybe exclusively) used to diversify proteins which "bind to stuff". For phages, it's typically binding to the host cell.
For bacteria, it could be binding to other bacteria / particles, but DGR targets seem to be mostly membrane-bound and include carbohydrate-binding domains (again there are a few exceptions).
It seems like DGRs are thus mostly used to change "external parts" of the phage and microbe. Kind of like these cables that are all the same inside but the connector part keeps (slightly) changing... A DGR would let you just change the connecting end.. (wouldn't it be nice..)
Anyway, binding to different compounds sounds potentially useful for most viruses/microbes, so why do we see DGRs strongly enriched in only some of them? We think it comes down to an evolutionary trade-off around "random binding",
e.g. in the ocean, binding to random sinking particles may not be the smartest thing ever. For phages, it could also be linked to the dominant host resistance mechanism (i.e. CRISPR vs cell wall modification vs endonuclease).
Finally, the majority of DGRs seem to be constitutively active. We know this by looking at metagenome time series, through which we can actually witness the end-result of DGR activity, i.e. changes in the protein sequences of the target.
Broadly, for both phages and cells, and across environments, most DGRs seem to be always "on", although we do see increased activity during "stress/perturbation" events.
To me that was the most surprising part: I expected DGRs to be tightly regulated and mostly inactive, and only "turn on" when mutations are needed. But instead, DGRs may be constantly maintaining a pool of variants in the population, to be selected if/when needed.
There are lots of other bits and pieces in there, and plenty of caveats and limitations to what we did, plus my analogies are "Friday evening with red wine" analogies, so if you're interested in these topics you should really not trust my twitter and instead read the pre-print. 🙂
And to close, many many thanks to everyone involved including especially @Blair @Sarah @HallamLab @TheHessLab @omalleylab and other co-authors not on twitter.
And a very special thanks to @crowe_lab, @mcmahon lab, @wrighton_lab and @LAHug_ who let us include the DGRs we found in their brand new (and unpublished) metagenomes in this analysis :-)
@TheWrightonLab was the correct handle 🙂
You can follow @simroux_virus.