The bacteria happily eating and reproducing and respiring in little plastic dishes sprinkled with nutrient broth in Jason Chin’s lab outside London look ordinary enough, but they differ in a fundamental way from every other living thing on earth, from fungi and avocados to tulips, robins, and elephants. They use a different genetic code — and yet, these artificial microbes are doing just fine.
In fact, these Escherichia coli have the most extensively “recoded” genome ever created, Chin and his colleagues at England’s Medical Research Council Laboratory of Molecular Biology in Cambridge reported on Wednesday in Nature. “This is a major milestone,” said Harvard biologist George Church, who was not involved in the new study. Here’s what you need to know about what he and other scientists are calling a landmark achievement in synthetic biology.
What’s ‘synthetic’ about this genome?
Everything. This is the largest genome ever made by stringing together DNA building blocks that scientists ordered from a supplier. That’s called “writing” a genome, something scientists at the GP-Write project are trying to do. (“Reading” a genome is what the Human Genome Project did — determine the sequence of its millions or billions of DNA letters, or bases.)
In 2010 genetics pioneer Craig Venter and his colleagues assembled the entire genome of the bacterium Mycoplasma mycoides this way, and scientists with GP-Write have synthesized 2 of the 16 chromosomes that make up the genome of a single strain of baker’s yeast. But the Mycoplasma genome is just 1.08 million pairs of bases, and the yeast chromosomes less than 1 million. E. coli’s is 4 million. Chin chopped that into 37 fragments and synthesized them, a process he called (of course) Genesis.
What is recoding?
Basically, changing the genetic dictionary. Every organism on earth uses the same 64 codons (three-letter combinations of DNA’s A’s, T’s, C’s, and G’s) to specify the amino acids going into the proteins it makes. TCA, for example, specifies serine, meaning, “grab that amino acid out of the cellular soup and attach it to the protein the cell is manufacturing.” AAG specifies lysine. TAA means stop adding amino acids to the growing protein. But AGT also means serine, as do AGC, TCT, TCC, and TCG. If nature were efficient, it would use 20 codons for 20 amino acids, plus one for “stop.” Recoders whittle down the redundant codons and assign them new functions.
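To make that redundancy concrete, here is a minimal Python sketch that builds the standard 64-codon dictionary and groups codons by the amino acid they specify. The compact one-letter string is the usual textbook encoding of the standard genetic code; the variable names are illustrative and do not come from Chin's paper.

```python
from collections import defaultdict
from itertools import product

BASES = "TCAG"
# Standard genetic code, one letter per codon in TCAG order; "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map each three-letter codon (TTT, TTC, TTA, ...) to its one-letter amino acid.
CODON_TABLE = {
    "".join(triplet): aa
    for triplet, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Group synonymous codons under the amino acid (or stop signal) they encode.
synonyms = defaultdict(list)
for codon, aa in CODON_TABLE.items():
    synonyms[aa].append(codon)

print(len(CODON_TABLE))        # 64 codons in total
print(sorted(synonyms["S"]))   # serine: ['AGC', 'AGT', 'TCA', 'TCC', 'TCG', 'TCT']
print(sorted(synonyms["*"]))   # stop: ['TAA', 'TAG', 'TGA']
print(len(synonyms) - 1)       # 20 amino acids, not counting the stop signal
```

Six codons for serine and three for stop are where the slack comes from: several of them can, in principle, be retired without losing any meaning.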
How did Chin achieve the recoding?
He and his team systematically replaced every occurrence of the serine codon TCG with AGC, every TCA (also serine) with AGT, and every TAG (stop) with TAA, for a total of 18,214 replacements. “There are many possible ways you can recode a genome, but a lot of them are problematic: The cell dies,” Chin said. For instance, supposedly synonymous codons nevertheless make slightly different amounts of proteins, and sometimes make proteins with unexpected characteristics that kill the cell. Chin discovered a recoding scheme that kept his E. coli alive and thriving despite using 59 codons rather than nature’s 61 to make all 20 amino acids, and two codons rather than nature’s three to say stop.
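The three substitutions can be sketched in a few lines of Python. This is a toy illustration under strong assumptions, not the team's method: it treats a gene as a single in-frame string of codons, uses a made-up 18-letter sequence, and ignores the effects on protein levels that Chin describes. It shows only that the swap leaves the encoded protein unchanged and leaves 61 codons in use.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code in TCAG order; "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(t): aa for t, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# The three replacements described for Syn61: old codon -> synonymous codon.
RECODING = {"TCG": "AGC", "TCA": "AGT", "TAG": "TAA"}


def codons(seq):
    """Split an in-frame coding sequence into three-letter codons."""
    if len(seq) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


def translate(seq):
    """Translate every codon, writing stop codons as '*'."""
    return "".join(CODON_TABLE[c] for c in codons(seq))


def recode(seq):
    """Swap each retired codon for its synonym; leave all others alone."""
    return "".join(RECODING.get(c, c) for c in codons(seq))


toy_gene = "ATGTCGAAGTCATCTTAG"  # hypothetical sequence, for illustration only
recoded = recode(toy_gene)

print(recoded)                                    # ATGAGCAAGAGTTCTTAA
print(translate(toy_gene) == translate(recoded))  # True: same protein

# Count the codons still in use after the three are retired.
remaining = [c for c in CODON_TABLE if c not in RECODING]
sense = [c for c in remaining if CODON_TABLE[c] != "*"]
print(len(remaining), len(sense), len(remaining) - len(sense))  # 61 59 2
```

Note that TCT, the other serine codon in the toy gene, is untouched: the scheme retires only TCG and TCA.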
“They created a strain that isn’t using three of the codons that the rest of nature does,” said Tom Ellis, an expert in synthetic biology at Imperial College London, who reviewed the paper for Nature. “Life is still possible without all of nature’s building blocks.” Chin calls the creation Syn61, for the number of codons it uses.
How unprecedented is this?
Synthetic biology has gone gaga for recoding. In 2013, scientists led by Church replaced all 321 of E. coli’s TAG stop codons with TAA, creating organisms that survived with only 63 codons, though not via genome synthesis. Three years later, Church’s lab went further, replacing seven redundant codons with their synonyms, but in only a fraction of the E. coli genome, so it doesn’t get to be Syn57. Syn61 takes recoding further than ever, building on Chin’s 2016 research mapping out the recoding schemes he would use for the current work. With three codons jettisoned in an entire genome, it makes hundreds of times more changes than any previous genome recoding.
Why is that a big deal?
“Recoding challenges the formula for life,” Ellis said. “By recoding a genome, you can push the envelope of what nature has given us and see if you can do it differently.”
But what practical reason is there for doing that?
By freeing up, say, TCG and handing its job to AGC, scientists could give TCG a new function: coding for one of the hundreds of amino acids beyond the 20 that nature strings into proteins. With a recoded genome, a cell might be able to synthesize novel enzymes and other proteins.
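As a rough sketch of that idea, the Python below reads the same DNA with two dictionaries: the standard one, and a hypothetical extended one in which the freed-up TCG stands for a new amino acid, written here as the placeholder "X". Both the "X" symbol and the toy sequence are assumptions made for illustration. The sketch models only the dictionary lookup, not the cellular machinery a real cell would need to read TCG this way.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code in TCAG order; "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD = {
    "".join(t): aa for t, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Hypothetical extended dictionary: TCG no longer means serine.
# "X" is a placeholder for some non-natural amino acid (an assumption).
EXTENDED = dict(STANDARD)
EXTENDED["TCG"] = "X"


def translate(seq, table):
    """Read codons in frame and stop at the first stop codon."""
    protein = []
    for i in range(0, len(seq) - 2, 3):
        residue = table[seq[i:i + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


# Toy gene: serine is written as AGC, and a single TCG is placed on purpose.
designed_gene = "ATGAGCTCGAAGTAA"

print(translate(designed_gene, STANDARD))  # MSSK
print(translate(designed_gene, EXTENDED))  # MSXK
```

The reassignment is only safe in a genome where TCG no longer appears in any existing gene; otherwise every old TCG serine would be misread as well.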
“Nature has given us all these enzymes that can do all these funky tasks,” Ellis said, from making cheese to extracting fruit juice, producing biofuels and industrial chemicals, and detecting biomarkers in medical tests. “That’s from just 20 amino acids. Think what you could do with 22 or more. There’s the potential for making all sorts of new chemicals” for medicine, food production, and industry.
In addition, recoded genomes might be impervious to viruses, as Church said Syn61 might be. That raises the possibility of recoding the genomes of bacteria used to make everything from pharmaceuticals to food, where viral infections cost industry millions of dollars every year.
Did this new study do that?
No, but someone will. “The key advance is what this enables going forward,” said Boston College chemist Dr. Abhishek Chatterjee, who also reviewed Chin’s paper for Nature. He recently got E. coli to incorporate non-natural amino acids into proteins it makes, though not by Syn61-type recoding. That, Chatterjee said, “opens up whole new possibilities for the bacterial synthesis of biochemicals.” It also makes the dreams of GP-Write look more feasible: “We may be seeing convergence on a standard strategy” for writing genomes, Church said, including for viral resistance. Chin’s work, he said, “will greatly embolden the rest of the GP-Write community, which have been working to make many organisms — industrial microbes, plants, animals, and human cells — resistant to all viruses by this recoding approach.”