This paper is the culmination of 6+ years of work with great friends around the world. Can't thank my collaborators, mentors, friends and fam enough. A thread:
Almost all carbon enters the biosphere through the Calvin-Benson-Bassham cycle. Ribulose Bisphosphate Carboxylase/Oxygenase AKA rubisco is the enzyme that does the tricky bit of that pathway - attaching CO2 to an organic molecule.
Rubisco arose a long long time ago (> 2.5 billion years) when there was almost no O2 in Earth's atmosphere, so this oxygenation was certainly no problem "in the beginning." https://www.annualreviews.org/doi/abs/10.1146/annurev-earth-060313-054810
Today's atmosphere, however, contains ≈21% O2. Though we are used to talking about rising CO2 levels (true fact), it's also the case that geologic processes sequestered CO2 over very long times so that present-day CO2 levels (0.04%) are quite low in a historical sense.
So over geologic time, rubisco's carboxylation substrate (CO2) became scarce while the competing, off-pathway substrate (O2) became abundant. This was, we think, a big problem for photosynthetic organisms.
Another way is by locally increasing the concentration of CO2 via a CO2 concentrating mechanism or CCM. This approach guarantees that most rubisco active sites are processing CO2 and not O2. https://pubmed.ncbi.nlm.nih.gov/32428488/ 
Today, all Cyanobacteria and many Proteobacteria have a CCM that is based on two crucial features: (i) energized inorganic carbon uptake and (ii) very large protein organelles (~100 nm) called carboxysomes that encapsulate rubisco with a carbonic anhydrase enzyme.
@Jjdesmarais2, other @SavageCatsOnly lab members and I recently performed a whole-genome screen in a proteobacterial chemoautotroph that has a CCM (H. neapolitanus). https://www.nature.com/articles/s41564-019-0520-8?draft=collection
That screen highlighted a single genomic locus that encodes all the activities required for the CCM, at least in principle.
In this new manuscript we test whether the 20 genes in that locus are sufficient to make a CCM in a non-native bacterial host, namely E. coli.
*BUT* E. coli is a heterotroph and doesn't need rubisco for any reason, so we designed a mutant strain that needs rubisco carboxylation to plug a little hole we made in its central metabolism.
We call the strain CCMB1 for CCM background 1. When we give it rubisco and phosphoribulokinase, it can grow in glycerol media but *only in elevated CO2*.
This "high-CO2 requiring" phenotype is the hallmark of bacterial CCM knockout mutants, so we figured adding a CCM to the mix would enable growth in ambient air (i.e. low CO2, high O2).
At first, this didn't work - the cells didn't grow, even in 10% CO2. After a few years of confusion, Eli and I figured out a good way to do selections for growth in ambient air on agar plates.
We extracted the plasmids from the mutants, mapped all the mutations by sequencing, reconstructed them individually, and showed that the plasmids alone confer growth in ambient air. That is, no mutations on the genome were needed.
We also tested a bunch of specific mutations to the CCM - messing up pieces of the carboxysome and transport system. All of the mutant CCMs grew in high CO2, but not in ambient air, as we expected. Shout out to Eli for heroic cloning work here.
The mutant growth phenotypes implied that growth in ambient air depends on the CCM as we currently understand it. If this is right, our cells should make carboxysomes and incorporate CO2 from air into their biomass.
Carboxysomes: check.

Shout out to @cblikstad for so much help with the EM.
CO2 from air into biomass: check.

Shout out to @GleizerShmuel, Roee Ben-Nissan and Elad Noor for LC-MS and analysis efforts.
Quick reminder that this strain is *not* an autotroph. It depends on rubisco carboxylation to fix a problem we created by knockout of a gene in the pentose phosphate pathway, but it still needs organic carbon to grow (glycerol in this case).
We showed that rubisco is actively fixing CO2 from ambient air into amino acids extracted from cellular proteins, which implies that we've really constructed a CCM!
... But a lot of the carbon in the cell comes from glycerol. Hopefully someday soon we'll be able to build a CCM in a real autotroph and make it go.
For now, however, we've shown that at most 20 genes are required to build a functioning bacterial CCM. This reconstitution also enabled us to whittle down the list to 18 genes because it makes controlled genetic experiments a lot faster and simpler.
I think there is still a lot to learn about the evolution and function of CCMs, and I'm convinced that reconstitution is one of the best approaches: "what I cannot create I do not understand" http://archives-dc.library.caltech.edu/islandora/object/ct1%3A483
Though a more biology-focused version of this famous Feynman quote might be "what I cannot create I cannot come to understand" since we still need to do a bunch of experiments with the reconstituted CCM to figure out how the whole thing really works.
Since I'm quoting Feynman, it's probably as good a time as any to mention that I'm off to @Caltech sometime this fall for a postdoc (COVID permitting). If you're in southern CA, definitely hit me up.
Far too many people to thank (see voluminous acknowledgements), but I have to mention @SavageCatsOnly and @MiloLabWIS, my science homes for the last ~decade, my partner Rachel and my pops, Jack Flamholz, who passed away during my PhD and whom I miss a whole lot.
You can follow @flamholz.