Delving into my genome, I learned a lot about how genetic variants influence my health, putting me at risk of some disorders and protecting me from others. But I also wanted to search inside my DNA for my history: my own ancestry, and that of our entire species.
I am hardly alone in my curiosity. Many people are sending their spit to testing companies so that they can learn about their origins.
The science behind genetic genealogy is young, but growing fast. A couple decades ago, researchers could only compare one gene in different people to see how they’re related. Now companies like 23andMe and Ancestry.com can look at hundreds of thousands of genetic markers at once. Yet those markers are just a tiny selection of our DNA. For the newest studies on our origins, researchers are comparing entire genomes.
To explore genealogy’s new frontier, I paid a visit to the New York Genome Center. It’s a fascinating vista, I discovered, although you have to beware of the mirages.
The researchers in New York ran a series of tests on my DNA, each pulling out different historical clues. In one test, they looked at my Y chromosome. Men have 22 pairs of nearly identical chromosomes, along with an X chromosome they get from their mothers and a Y they get from their fathers. The other 22 pairs shuffle segments of DNA together with each new generation.
The Y, on the other hand, is passed down virtually unchanged from father to son.
That unusual inheritance makes the Y relatively easy to study. If a man acquires a new mutation on a Y chromosome, all his male descendants will inherit it. They can quickly identify themselves as cousins by the mutations that they share.
It turns out I have a Y chromosome called E1b1b1. In some ways, that’s not a big surprise. E1b1b1 is a fairly common Y chromosome among Jewish men. My father is an Ashkenazi Jew with roots in Germany and eastern Europe.
But E1b1b1 is hardly just a “Jewish” chromosome. E1b1b1 turns out to be common as well among men in southern Europe and North Africa. And it’s most common in Ethiopia and Somalia.
Does that mean I’m Somali? I’m pretty sure I’d get turned away from a family reunion in Mogadishu. Nevertheless, my particular Y chromosome probably originated in a man who lived thousands of years ago in the Horn of Africa.
Some of his male descendants made their way to North Africa and then hopped into southern Europe. Others spread across the Red Sea to the Near East, where the Y chromosome ended up in the Jewish population. My Y chromosome took a long journey, in other words, that its original owner did not.
“The Y chromosome is just from one ancestor,” Joseph Pickrell, a researcher at the New York Genome Center, told me. “It’s an ancestor that a lot of people care about, but it’s randomly chosen from thousands of ancestors.”
To get a richer view of my ancestry, Pickrell and his colleagues looked at my entire genome, seeing how common each of its genetic variants is in different populations across the world. On its own, each variant can’t say much. They may find, for example, that one of my variants is present in 50 percent of northern Europeans but only 20 percent of Nigerians.
“That gives you a tiny bit of information,” said Pickrell. “But if you sum that up over hundreds of thousands of genetic variants, it gets to be very, very powerful.” If you have hundreds of thousands of variants that are all more common in Nigerians than Northern Europeans, you probably got them from Nigerian ancestors.
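To see why many weak clues add up to a strong one, here is a minimal Python sketch. It is an illustration only, not the method Pickrell's team used. It assumes the 50 and 20 percent figures are allele frequencies, that variants are inherited independently, and that each person carries 0, 1, or 2 copies of a variant by simple binomial sampling. The function names are hypothetical.

```python
import math

def log_likelihood(genotypes, freqs):
    """Log-probability of the observed genotypes under one population.

    genotypes: copies of each variant a person carries (0, 1, or 2)
    freqs: frequency of each variant in the population (strictly between 0 and 1)
    """
    total = 0.0
    for copies, p in zip(genotypes, freqs):
        # Binomial probability of drawing `copies` variant alleles out of 2.
        prob = math.comb(2, copies) * p**copies * (1 - p) ** (2 - copies)
        total += math.log(prob)
    return total

def evidence_for_a_over_b(genotypes, freqs_a, freqs_b):
    """Positive values favor population A, negative values favor population B."""
    return log_likelihood(genotypes, freqs_a) - log_likelihood(genotypes, freqs_b)

# One variant at 50 percent in population A and 20 percent in population B,
# carried in a single copy: only a small nudge toward A.
one_variant = evidence_for_a_over_b([1], [0.5], [0.2])

# A thousand such variants: the same nudge, added up a thousand times.
many_variants = evidence_for_a_over_b([1] * 1000, [0.5] * 1000, [0.2] * 1000)

print(f"one variant:   {one_variant:.3f}")
print(f"1000 variants: {many_variants:.1f}")
```

Because the evidence is summed on a log scale, each variant contributes only a small amount, but the total grows in proportion to the number of variants. Real ancestry software also has to handle mixed ancestry and linked variants, which this sketch ignores.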
For someone like me, the picture isn’t so simple. While my father is an Ashkenazi Jew, my mother can trace her ancestry back to Protestants in England and Catholics in Ireland and Germany. Different parts of my DNA thus point back to different origins.
To figure out the sources of my genome, Pickrell and his colleagues used a variety of statistical methods to divide my ancestry up into the most likely combination of groups.
They confirmed that I’m a pan-European mutt. One of their methods, for example, found that about 43 percent of my genome is Ashkenazi, 25 percent from northern or central Europe, 23 percent Italian, and 6 percent southwestern European (Spain, Portugal, and France). I have a speck of Northern Slavic DNA (2.2 percent), and 1.3 percent that they couldn’t place.
“You should treat those numbers as an approximation of reality,” Pickrell warned. Basically, his analysis showed that I’m half Jewish — which fits nicely with all the Passovers and bar mitzvahs I’ve had with my father’s side of the family. But the notion that I’m almost a quarter Italian comes as a complete surprise (as much as I may like gnocchi). My numbers may shift in years to come, Pickrell said, as scientists add more genomes to their studies.
These variants can only take me back a few centuries into my ancestry. To dig deeper into my origins, the New York Genome Center scientists used another piece of software, called RFMix, to probe my genome. It takes advantage of the way DNA gets passed down from generation to generation.
As eggs and sperm develop, their chromosome pairs come together and swap chunks of DNA. That means that our genomes are actually assembled from ancient chunks of DNA.
“It’s a quilt, made up of a segment from one ancestor attached to a segment from another ancestor,” said New York Genome Center researcher Nathaniel Pearson. “And we’re trying to figure out where those segments came from.”
Close relatives share long, identical stretches of DNA in the chromosomes. Over the generations, those stretches get smaller. Yet even when they get downright tiny, RFMix can pick out matching chunks of DNA in relatives. It can even find them in people who share a common ancestor a couple thousand years ago.
Pearson and his colleagues used RFMix to see how my Ashkenazi ancestors came to be Ashkenazi in the first place. According to one theory, a kingdom near the Caspian Sea populated by people known as Khazars converted en masse to Judaism a thousand years ago. According to another, waves of Jews migrated out of the Near East during the Roman Empire, spreading to different parts of Europe. Later, in the Middle Ages, they came together in Poland to escape persecution.
Pearson and his colleague Dina Zielinski tested these possibilities by looking for matching chromosome segments in genomes from a few representative groups of people in Europe and the Near East. To test the Khazar hypothesis, for example, they looked at genomes from an ethnic group from that region, called the Adygei. To look for an ancestry in the Near East, they added Palestinians and Druze to the analysis. And to include Europeans in the mix, they used genomes from French people and Russians.
A lot of my genome, Pearson and Zielinski found, was most similar to French DNA. That’s probably due mainly to my mother’s ancestry. My father’s Ashkenazi DNA, on the other hand, matched up in many segments with the Palestinian and Druze genomes. In other places, it matched the Russian DNA. Very little of it looks like Adygei.
Pearson warned me that this kind of research pushes the science of ancestry pretty close to its limits. “We have to have tons of grains of salt on the table,” he said. But at the very least, my genome turns out to be an argument against the Khazar theory of the Jews. Hidden behind the Ashkenazi curtain of my DNA lies an origin in the Near East.
But when it comes to human ancestry, each curtain pulled away reveals another curtain further back in time. The Jews may have first gotten their identity as a people in the Near East, but they shared a much older identity with the rest of humanity. And I was about to discover that there were echoes of that ancient identity in my genome as well.
Like all of us, I have clues about my ancestors lodged in my genome. Some clues reach back to 18th-century shtetls in Ukraine, others to medieval villages in Ireland. But like all other humans, I also carry clues in my genome that were left there much, much earlier.
“Everything that went into our species is preserved in your genome,” Beth Shapiro, a geneticist at the University of California, Santa Cruz, told me.
I asked Shapiro to help me read that ancient history in my DNA. Shapiro is the codirector of the UCSC Paleogenomics Lab, where scientists pluck bits of ancient DNA out of all manner of remains, from 100,000-year-old American camel fossils to 700-year-old frozen caribou feces. But they can also analyze the genome of living organisms and tease out the history of their ancestors.
“Did you use your genome to create a PSMC plot?” Shapiro asked me when I got in touch. I didn’t even know what a PSMC plot was, which instantly made me want one.
PSMC, I learned, stands for pairwise sequentially Markovian coalescent, and it’s a method that allows scientists to turn a single genome into a time machine, so that they can visit ancestors who lived long before our species even existed.
The method takes advantage of the fact that we have two copies of each gene. We inherit one copy from our mother, and the other from our father. Very often, those genes are not quite identical.
For example, I have two copies of a gene called MEFV. Each copy has a handful of mutations not found in the other. One of the mutations on one copy can cause a genetic disorder called familial Mediterranean fever. But you have to have the same mutation on both copies of the MEFV gene. Thankfully, I only have it on one.
Each copy of the MEFV gene has its own history. The copy I got from my mother came from one of her parents, and so on back through time. The same is true for the copy I got from my father. If you were to go back far enough in time, the history of both copies of my MEFV would converge. They both came from a common ancestor.
It turns out that if you look at any gene in a genome, you can estimate when that common ancestor lived. Over time, genes pick up new mutations at a roughly regular rate. If two copies of a gene differ by a lot of mutations, they come from a common ancestor who lived a long time ago. If they’re identical, they may have originated much more recently — perhaps in an ancestor who lived within the past few generations.
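The clock-like reasoning above can be sketched as a back-of-the-envelope calculation in Python. This is not the PSMC method itself, which fits a far more sophisticated statistical model along the whole genome. The mutation rate and generation time below are assumed round numbers chosen for illustration, and the function names are hypothetical.

```python
def generations_to_common_ancestor(differences, length_bases, mutation_rate=1.25e-8):
    """Rough number of generations back to the shared ancestor of two DNA copies.

    differences: number of bases at which the two copies differ
    length_bases: length of the stretch being compared
    mutation_rate: assumed mutations per base per generation (illustrative value)
    """
    if length_bases <= 0:
        raise ValueError("length_bases must be positive")
    # Both copies have been collecting mutations since the ancestor,
    # so differences build up at twice the per-lineage rate.
    return differences / (2 * mutation_rate * length_bases)

def years_to_common_ancestor(differences, length_bases,
                             mutation_rate=1.25e-8, years_per_generation=25):
    """Convert the generation estimate to years (generation time is an assumption)."""
    generations = generations_to_common_ancestor(differences, length_bases, mutation_rate)
    return generations * years_per_generation

# Two copies of a 10,000-base stretch that differ at 10 positions.
estimate = years_to_common_ancestor(differences=10, length_bases=10_000)
print(f"roughly {estimate:,.0f} years")

# Identical copies give an estimate of zero: a very recent shared ancestor.
print(years_to_common_ancestor(differences=0, length_bases=10_000))
```

Under these assumptions, ten differences in 10,000 bases point to a common ancestor about a million years back. Any single short stretch gives a very noisy estimate, since it may contain no differences at all or only one, which is why methods like PSMC pool information across the entire genome.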
Shapiro and André Elias Rodrigues Soares, a postdoctoral researcher in her lab, looked across my whole genome to discover its history. They divided my DNA into 30 million pieces, each 100 bases long, and then compared my two copies of each piece. If a pair was genetically similar, that meant I inherited them from a fairly recent common ancestor. If they were substantially different, that meant they came from a far older forebear. The oldest pieces of my genome come from my ancestors from several million years ago — back when they were small-brained hominins living in Africa long before our species emerged.
Shapiro and Soares also found that these ages aren’t smoothly spread out across history. Instead, a lot date back to certain periods in the past. That pattern reveals something profound about human history: It tells us how the population of our species rose and fell and rose again.
When a population is big, it has a lot of genetic diversity. New mutations arise in children, and they can be passed down to future generations. If that population suddenly crashes, a lot of that genetic diversity vanishes. Now parents can only pass down a limited number of genetic variants to their offspring. The genetic diversity of small populations is thus usually low.
Thanks to this difference, scientists have found, you can estimate the size of the human population in the past by comparing the versions of genes in a single person’s genome. If you can trace a lot of genes back to one particular period of time, that means that the human population was relatively big back then.
While this kind of analysis can give only a rough measure of the size of the human population in the past, it is still good enough to give us a glimpse of our prehistory. In my case, here is what Shapiro and Soares saw:
Over a million years ago, my ancestors (and the ancestors of every living human) belonged to a relatively big population. It was probably Homo erectus, a species that emerged about 1.8 million years ago. Homo erectus stood about as tall as we do today, but had a brain only about two-thirds the size of our own. They made stone tools that they likely used to carve meat from animal carcasses and to dig up wild tubers from the ground.
By about 600,000 years ago, however, our ancestors experienced a dramatic decline — a decline inscribed in my genome. This collapse probably coincided with Homo erectus evolving into two new lineages — one that became Neanderthals and the other that became our own species, Homo sapiens.
“It’s just what happens when species split,” Soares told me. “You’re decreasing the population because you’re splitting one population in two.”
Then, my genome shows, my ancestors crashed again. At some point between about 75,000 and 50,000 years ago, a small number of Africans moved into the Near East. Their descendants later reached Europe, Asia, Australia, the Pacific, and the New World.
Today non-Africans may number in the billions, but they originated from a tiny group of travelers carrying only a tiny level of genetic diversity compared to what you can find in Africa. I still carry the genetic imprint of that little band that first made their way to a new continent.
“What it shows, beyond any shadow of a doubt, is that the entirety of your lineage is preserved in your DNA,” Shapiro said.
Astronomers use telescopes to see great distances, but they use different kinds of telescopes to see different features of the universe. Geneticists have telescopes of their own for looking back across the history of our species. PSMC plots are useful to reconstruct the size of populations over hundreds of thousands of years. But other telescopes can reveal other stories in our genomes.
After Shapiro and Soares showed me the deep history of my ancestors, I traveled to Cold Spring Harbor Laboratory to switch my telescope.
I went looking for Neanderthals.
On a warm, late winter morning, I paid a visit to Adam Siepel, a biologist at Cold Spring Harbor Laboratory on Long Island. In his office, a tiny rock garden on a shelf burbled with flowing water. Nearby, Siepel kept a picture of his children next to the brooding replica of a Neanderthal skull. I wondered if Siepel had arranged them as a family portrait: ancestors and descendants.
Siepel studies genomes to learn about the intertwined histories of humans and their extinct relatives. He and his colleagues are learning how we diverged from a common ancestor and then mingled our genes together. A couple weeks beforehand, I had asked Siepel if he wouldn’t mind looking at my genome to see how it fit into that history.
He had never gotten such a personal request before, but he decided to give it a shot. “You’re definitely pushing us into a new area,” Siepel told me. But the more he inspected my genome, the more interesting the exercise became. “It’s actually a little addictive once you start,” he admitted.
We were soon joined by Melissa Jane Hubisz, a Cornell graduate student who works with Siepel. Ilan Gronau, a colleague of his from Israel, joined in by Skype on a flat-screen television on the wall. They were ready to tell me about what they had found.
Neanderthals first came to light in 1856, when quarry workers dug up strange-looking bones in Germany. Since then, many more Neanderthal bones have been uncovered, giving us a fuller picture of Neanderthals. They were stout, barrel-chested, heavy-browed, big-brained humans. They lived across Europe and Asia, killing rhinos and other large game and decorating themselves with jewelry made of shells and eagle claws. They thrived for thousands of generations, as Ice Age glaciers advanced and retreated, only to mysteriously vanish 40,000 years ago.
In the 1990s, researchers discovered that bits of DNA were still lurking in Neanderthal fossils. Over time they figured out how to collect more material, and in 2013, they recovered an entire genome from a single Neanderthal toe bone. Comparing Neanderthal DNA to that of living humans, they found that we split from a common ancestor around 600,000 years ago.
As the scientists searched for more ancient DNA, they were able to recover another genome from a chip of finger bone found in a Siberian cave called Denisova. To their surprise, it belonged to a third lineage, which they dubbed the Denisovans. Denisovans and Neanderthals, the DNA revealed, split apart about 450,000 years ago.
When scientists began comparing the genomes of Neanderthals and Denisovans to living humans, they found chunks of DNA from these extinct humans were nestled inside modern human genes. After modern humans expanded out of Africa over 50,000 years ago, they appear to have interbred with Neanderthals several times. Today the genomes of all non-Africans are at least a couple percent Neanderthal DNA.
Denisovans, on the other hand, left their DNA almost entirely in today’s residents of New Guinea and neighboring islands.
Now a younger generation of scientists is digging deeper into the story of human interbreeding. Earlier this year, for example, Siepel and his colleagues reported that the flow of genes actually went both ways. Over 100,000 years ago, Neanderthals acquired modern human DNA in their genome.
At my request, Siepel and his colleagues turned their tools on my DNA. To trace my evolutionary history, they compared my genome to those of a few people from different parts of the world, along with the Neanderthal and Denisovan genomes. Then Siepel and his colleagues asked a very simple — but monstrously difficult — question: What version of evolution from a common ancestor would account for the way our genomes look today?
Siepel and his colleagues ran their computer program for days to try out thousands of different scenarios. “It builds a coherent model that has to explain everything,” said Siepel. Now Siepel and his colleagues were ready to show me how I fit on the human family tree.
As someone of European descent, their study showed, I shared a close ancestry with Asians. That’s because we all descend from the small group of people who expanded out of Africa over 50,000 years ago. I share an older common ancestor with living Africans. And I’m an even more distant cousin to Neanderthals and Denisovans — except for a small portion of my DNA, which comes from a relatively recent interbreeding with Neanderthals.
Siepel and his colleagues estimated that I have at least 2 percent Neanderthal DNA, but they warned that the true figure could be as high as 4 percent.
A growing number of people are getting estimates of their Neanderthal DNA, thanks to the direct-to-consumer genetics company 23andMe. But simply knowing your percentage of Neanderthal DNA isn’t much more than a conversation piece.
“They just give you a number,” said Siepel. “They don’t tell you where you’re Neanderthal.” With the help of their statistical tools, Siepel and his colleagues could locate each of the Neanderthal chunks of DNA in my genome. It turns out I have over 1,000 segments of Neanderthal DNA measuring 10,000 bases or longer. The biggest of them spans 189,871 bases.
“I have a list of interesting regions,” Siepel said, pulling out a piece of paper. “There’s a lot I don’t know about these, but here are some that I’ve flagged.”
Siepel explained that one of my Neanderthal segments contained a gene called DSCF5. Variants in the gene have been linked to coronary artery disease.
“We can click on a few others that I found,” Siepel said, as Hubisz navigated us through a genome browser on the computer. He threw out names of other genes — CEP350, GPATCH1, PLOD2 (“catchy name,” he muttered).
Siepel didn’t know if these Neanderthal variants had any effect on my health for good or bad. “I don’t have a really coherent story,” he admitted.
I knew someone who might, though.
At Vanderbilt University, Tony Capra, a computational biologist, is using a DNA database kept by the school’s medical center to search for links between Neanderthal variants and an assortment of diseases.
“If I know your Neanderthal DNA, can I better predict your risk of, say, depression?” Capra wondered.
Capra and his colleagues found that the answer was yes. In one study, they examined more than 28,000 patients of European descent, checking their DNA at 1,495 spots to see if they had a Neanderthal variant there. They created a kind of genetic Neanderthal score for each patient.
The scientists found that people with similar Neanderthal scores also had similar risks for certain disorders, including depression, mood disorders, and actinic keratosis (scaly skin lesions caused by sun exposure).
I asked Capra and his colleagues, Joshua Akey and Selina Vattathil at the University of Washington, to take a look at my genome to see how my Neanderthal heritage affects my health.
They calculated my Neanderthal score and then compared my score to the results from their study on Vanderbilt patients.
“In the depression case,” Capra told me, “the Neanderthal DNA incredibly, incredibly, incredibly slightly decreased your risk. And it incredibly slightly increased your risk for actinic keratosis.”
In their research, Capra and his colleagues have also looked for individual genes that had effects that were strong enough to be detected on their own.
Scanning the electronic health records of their subjects, the scientists noted how many of them experienced each of more than 1,000 medical conditions. They then looked at the people with each condition, noting whether they shared any Neanderthal variants that were less common in other people.
So far, this search has led Capra and his colleagues to a handful of genes. One of those genes, called SELP, is involved in forming blood clots. The Neanderthal variant can cause people to make too many clots. “That gives you a risk of all sorts of nasty things, like stroke and embolisms,” said Capra.
Capra found that I have the modern human version of SELP. But I do have Neanderthal variants of a few other genes that may potentially raise my risk of other disorders. I use a Neanderthal gene to make certain proteins in my thyroid gland, for example. Those proteins end up in the gut, where they have been linked to digestive disorders. Another of my Neanderthal genes puts me at slightly greater risk of nosebleeds.
It’s possible that some surviving Neanderthal genes make us sick because they don’t mesh well with our own biology. But some Neanderthal genes that are harmful today may have been beneficial to our ancestors. A Neanderthal version of SELP, for example, could have helped our ancestors recover from injuries faster. “Having an increased coagulation response could help you seal up wounds more quickly,” Capra said.
In 10 years, Capra and his colleagues will probably know a lot more about this evolutionary game of roulette.
When you take your genome to the frontier of research, you have to be willing to accept some murky results and some cruel teases. When I was visiting Siepel and his colleagues at Cold Spring Harbor, Gronau said something so casually that I almost missed it.
“There is some Denisovan gene flow in Carl’s genome,” he said.
I sat up straight. “What?”
“You have a tiny bit, which is more than I see in the other genomes that I’ve tried,” Gronau said.
That made no sense at all. The few Denisovan fossils ever found (the finger bone and a couple teeth) were discovered in a Siberian cave. The only people with substantial amounts of Denisovan DNA alive today are Melanesians — the people in and around New Guinea. I think I’d know if I was part Melanesian.
But scientists have also found some evidence that tiny traces of Denisovan DNA have spread into other groups of people. In Tibet, for example, many people carry a Denisovan version of a gene called EPAS1. Studies on these Tibetans show that it helps them cope with life at high altitudes.
“Yeah, I found it, too,” Hubisz said to Gronau.
Siepel looked at me with a grin. “How are you at high altitudes?” he asked.
The result might be an error. Or perhaps I really do have some Denisovan DNA — which was first acquired by Neanderthals, who then passed it on to modern humans. “Still, you can’t exclude the possibility that you have some Denisovan DNA,” said Siepel.
By midday, we were done inspecting my genome and hungry for lunch. Gronau signed off from Israel, and Hubisz, Siepel, and I stood up to stretch our legs. “I wish there was more of a punch line,” Hubisz said, almost apologetically. “We found a lot of data.”
“Well,” Siepel added, “we found that he was part alien. Maybe we can look into that a little more.”
If you want a full accounting of your genome, you can’t stop with your genes. We have about 20,000 protein-coding genes, but they make up only about 1 percent of the human genome. A complete catalog of a human genome has to include genomic parasites.
These parasites — which scientists call transposable elements — are a weird menagerie of viruses and parasitic chunks of DNA that have propagated themselves throughout our genomes over millions of years. Most of them are dead, disabled by ancient mutations. But a few continue to spread over the generations, putting us at risk of diseases and reshaping the genome along the way.
Transposable elements can be surprisingly hard to recognize, and so many are yet to be discovered. But scientists have already found well over a million of them in human DNA, which take up a staggering amount of space in our genomes.
“Right now, everyone in the field is going to agree that it’s going to be more than 50 percent,” Cedric Feschotte, a geneticist at the University of Utah, told me. But some new studies are raising the total to 65 percent. “It’s quite possible that the real number would be closer to 80 to 85 percent.”
I asked Feschotte and his colleague Aurelie Kapusta to take a look at my genome for one particular kind of parasite: viruses.
Normally, viruses hijack cells to produce new viruses that kill the cell as they escape to infect other cells, or other hosts.
But every now and then, viruses insert their genes in an egg or a sperm cell. Instead of killing the cell, the virus’s DNA gets carried down into an embryo, which grows up and can pass down the viral DNA to its own offspring. Once this happens, viral DNA can get copied and inserted back into the same genome, multiplying over generations.
Mutations gradually disable these viruses, leaving harmless genetic fragments in our genome. Scientists have discovered over 100,000 such fragments in human DNA, making up about 9 percent of our genome.
Scientists have discovered some viral fragments that are many millions of years old, judging from the fact that the same viral DNA can be found in the same spot in the genomes of other primate species. But some viruses have been making new copies of themselves more recently than that.
Feschotte and Kapusta pointed me to some viral segments in my genome that aren’t found in any species other than humans. They must be less than about 7 million years old, because that is when our ancestors split from those of chimpanzees, our closest living relatives.
I was especially intrigued by one kind of virus that Feschotte and Kapusta found in my genome, known as HML-2. While it’s not found in any other living species, scientists have also found it in the genome of Denisovans, a mysterious, extinct lineage of humans. The common ancestor of humans and Denisovans lived about 600,000 years ago. So that particular virus must have infected my ancestors before then.
Along with viruses, scientists have discovered several other kinds of transposable elements, each with its own distinctive way of parasitizing our genome. At least 17 percent of our genome has been generated by a parasite known as LINE-1. LINE-1 was never a virus. Instead, it’s just a stretch of DNA that encodes a protein that inserts a new copy of its DNA back into the genome. LINE-1 is the granddaddy of all genomic parasites. Scientists have found LINE-1-like elements in the genomes of other animals, as well as plants, fungi, and even bacteria.
Incredibly, we carry another kind of transposable element that evolved to exploit LINE-1. About 60 million years ago, a gene mutated into a transposable element called Alu. Alus can’t insert their new copies back into the genome. Instead, they hijack LINE-1’s proteins and use them to do the job. This sinister trick has produced over 1.2 million Alus in the human genome today, making up at least 10 percent of the human genome.
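The share of the genome taken up by Alus can be checked with rough arithmetic. The count of 1.2 million comes from the text. The typical Alu length of about 300 bases and the genome size of about 3.1 billion bases are round figures assumed here for illustration, and many Alu copies are shorter fragments.

```python
ALU_COUNT = 1_200_000                # from the text: over 1.2 million Alus
ALU_LENGTH_BASES = 300               # assumption: approximate length of a full Alu
GENOME_LENGTH_BASES = 3_100_000_000  # assumption: approximate human genome size

alu_bases = ALU_COUNT * ALU_LENGTH_BASES
alu_fraction = alu_bases / GENOME_LENGTH_BASES

print(f"Alus cover about {alu_bases:,} bases")
print(f"That is about {alu_fraction:.1%} of the genome")
```

Under these assumptions the total lands a little above one-tenth of the genome, which fits the figure of at least 10 percent.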
While most transposable elements in our genomes are dead, a small fraction can still replicate themselves. For example, I have 1,024 Alus that aren’t in the human reference genome. And sometimes, these still-active parasites can be a problem.
Each time a new transposable element drops into our genomes, it’s as if we’re playing a genetic game of Russian roulette. Once a new transposable element is created, it gets inserted back into our genomes pretty much at random. “If it lands in the middle of a gene, it’s going to have a big effect,” Mark Gerstein of Yale University told me after searching my genome.
Indeed, scientists are finding that transposable elements can play a part in a number of diseases. Certain types of cancer cells, for example, carry new transposable elements that appear to help make them aggressive.
Fortunately, the transposable elements found in my genome looked normal. “Normal means we found them in people who don’t have disease,” said Christopher Mason of Weill Cornell Medicine, who independently took a look at my inner parasites. “No guarantees, though.”
As dangerous as transposable elements can be, we have come to depend on some of them. Mutations have transformed some transposable elements from parasites into useful tools for our own survival. We harnessed some genes from viruses, for example, and use their proteins to fight other viruses.
Alus, scientists have also found, have sometimes merged into our own genes. By adding their DNA to our genes, they have given rise to new proteins.
So I was fascinated to learn from Mason that he found four new Alus sitting in the middle of my genes — where perhaps they might someday become part of those genes in my descendants.
“It isn’t a cause for alarm,” Mason assured me. “In the long view, it could be a playground for evolution.”
My trip through my genome has been a strange one. I’ve discovered extra genes in my DNA, and broken ones. I’ve learned I have Neanderthal genes for nosebleeds, and genetic traces of a mysterious southern European ancestor in my not-too-distant past. My genome contains strengths and weaknesses, I’ve learned: I’m protected against an autoimmune disease by one variant, while another puts me at risk of gaining extra weight.
Yet for everything we can know about our genomes, there’s far more left to decipher. Scientists are still figuring out how to make an accurate reconstruction of our DNA, for example. We are thus still a long way from a day when genomes will become a regular part of medicine.
“Everyone in this area knows there are huge pitfalls,” said Adam Siepel of Cold Spring Harbor Laboratory. The idea of using today’s genome-sequencing technology to let people make medical decisions is, to Siepel, “really terrifying.”
Siepel said that the process is too error-prone to be ready for medical prime time. “You’d have to be sure you’ve ironed all these things out. And we’re nowhere near ironing them all out.”
The trouble with genome sequencing starts at the very first step. In a sample of blood or saliva, scientists collect the DNA from millions of cells. While those cells all originated from a single fertilized egg with a single genome, they picked up mutations as they divided. “As soon as you have more than one cell, you can have differences in those cells,” said Christopher Mason of Weill Cornell Medicine.
A different kind of error — technical noise, as Mason refers to it — can creep in when scientists sequence the DNA. Chemical reactions can go awry. Some sequencing methods involve attaching glowing molecules to bases, for example, and sometimes the molecules fall off. “You’ve lost the flashlight,” said Mason.
DNA sequencing companies build in a lot of error correction to squelch technical noise. But once scientists get their hands on the raw data from genome sequencing, they have to wrestle with another kind of ambiguity: computational noise. Computer programs may map some fragments of DNA to the wrong part of the genome. They may fail to map other fragments at all. Ultimately, the information about a given base may be so ambiguous that a scientist may have to just leave it blank.
On my travels with my genome, I got to experience this noise firsthand. When I asked Mason to analyze my genome, he counted up 3,998,314 SNPs — genetic variants each of which changed a single base in DNA. When I gave my genome to another expert — Mark Gerstein at Yale — he found 3,559,137 SNPs. Gerstein and Mason are among the top genomics experts in the country, and yet they came up with tallies that differed by 439,177 SNPs.
When I awkwardly brought up that difference with Mason, he didn’t bat an eye. “You can take the same genome and add it up and get a different answer,” he said.
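For a sense of scale, the gap between the two tallies can be worked out directly. The short Python sketch below uses only the two counts quoted above; the variable names are illustrative, and the counts alone say nothing about which individual SNPs the two analyses agreed on.

```python
# SNP tallies reported by the two labs for the same genome (from the text).
mason_snps = 3_998_314
gerstein_snps = 3_559_137

difference = mason_snps - gerstein_snps

# Express the gap relative to the larger of the two tallies.
share_of_larger = difference / mason_snps

print(f"difference: {difference:,} SNPs")
print(f"share of the larger tally: {share_of_larger:.1%}")
```

By this rough measure, the two counts differ by about 11 percent of the larger tally.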
If you sequence the same person’s genome twice, using the same sequencing machine and the same software to interpret the results, Mason explained, the results will not be identical. “Ninety-five percent is about as good as it gets,” he said.
That kind of accuracy is good enough to do research on genomes. Early telescopes weren’t terribly accurate, either, and yet they still allowed astronomers to discover new planets, galaxies, and even the expansion of the universe. But if your life depended on your telescope — if, for example, you wanted to spot every asteroid heading straight for Earth — that kind of fuzziness wouldn’t be acceptable.
Improving accuracy is only half the struggle for scientists who are trying to bring genome sequencing into the clinic. Once medical geneticists can look at the genomes of their patients, they have to decide what all the genetic variants actually mean for their health.
To see how medical geneticists are learning how to read genomes, I went back to Boston to see Robert Green, the doctor who had originally invited me to get my genome sequenced and fall down this deep rabbit hole.
I met Green at his office at Harvard Medical School with his colleague, Matthew Lebo, the director of bioinformatics at the Laboratory for Molecular Medicine at Partners HealthCare. The two of them had already been inspecting my genome for a while when I arrived. An alphanumeric splatter filled their monitor.
Green and Lebo knew they couldn’t possibly inspect every one of the millions of variants in my genome for a possible risk to my health. So they were narrowing down their list to the most likely suspects.
Most of what scientists know about genetic disorders concerns protein-coding genes, which make up only 1 percent of the genome. So, with a few keystrokes, Lebo threw away the millions of variants in my non-coding DNA. He and Green were now left with 81,000 variants on the screen.
Lebo and Green suspected a lot of those 81,000 variants were probably harmless, too. They applied another set of filters to throw out more of them. For example, they eliminated all the genetic variants that have not yet been linked to a particular disease in a published scientific paper. Now we were down to just 145 variants.
Even that list was probably too long. It was possible that some of the scientific reports that linked those 145 variants to diseases were wrong. So Lebo and Green used another trick to shrink the list even more. They reasoned that rare diseases had to be caused by rare genetic variants. So they got rid of all the variants that have been found in over 5 percent of people of European descent.
“Now we’re down to 18,” Green declared. “So how do we get meaning out of them?”
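The cascade of filters Lebo and Green applied can be sketched in a few lines of Python. This is only an illustration of the idea, not their actual software: the `Variant` record, its field names, and the four toy variants are all hypothetical, and real clinical pipelines work with far richer annotations.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    gene: str                # hypothetical gene label
    in_coding_region: bool   # does it fall in a protein-coding gene?
    disease_linked: bool     # linked to a disease in a published paper?
    european_freq: float     # frequency among people of European descent


def filter_variants(variants, max_freq=0.05):
    """Apply the three filters in order and return each intermediate list."""
    # Step 1: discard variants in non-coding DNA.
    coding = [v for v in variants if v.in_coding_region]
    # Step 2: keep only variants with a published disease link.
    linked = [v for v in coding if v.disease_linked]
    # Step 3: discard variants found in over 5 percent of the population.
    rare = [v for v in linked if v.european_freq <= max_freq]
    return coding, linked, rare


# Toy data, invented for illustration only.
toy = [
    Variant("GENE_A", False, False, 0.30),
    Variant("GENE_B", True, False, 0.10),
    Variant("GENE_C", True, True, 0.20),
    Variant("GENE_D", True, True, 0.01),
]

coding, linked, rare = filter_variants(toy)
print(len(toy), len(coding), len(linked), len(rare))
```

Each step only ever shrinks the list, which is why the order of the filters does not change the final result, only the intermediate counts. What remains at the end still needs human judgment.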
Lebo and Green had run out of automatic filters. Now they had to deploy their own expertise, closely inspecting each variant, reviewing the findings of other clinical genetics labs, and bringing a skeptical eye to the scientific literature.
Some of the variants held up to this scrutiny. For example, one of the variants on their list is located in my MEFV gene. It can cause a condition called familial Mediterranean fever, which I first saw flagged in my report from Illumina. Green and Lebo decided this was a solid finding.
This diagnosis didn’t make me fret. You need two copies of the variant to get the disease, and I only have one. And since my children have reached their teens without developing the symptoms, I’m pretty sure they didn’t get a copy from my wife.
But I wasn’t so blasé when Lebo and Green turned their attention to another gene, called DSG2.
“This is one that’s … interesting,” Lebo said, inspecting the screen.
Interesting is not a word you want to hear from a medical geneticist who is inspecting your DNA.
DSG2 makes proteins in the heart that make cardiac cells stick together. Mutations to DSG2 can potentially make it do that job worse. Geneticists who study the variant I carry have linked it to cardiomyopathy, a rare condition in which the heart walls become deformed, leading to sudden, fatal heart attacks.
I nodded as Lebo and Green explained my DSG2 variant to me. But inwardly I was desperately wondering why they weren’t bundling me into an ambulance to head straight to a cardiologist. Instead, they opened up a new window on Green’s monitor to look at a dense spreadsheet.
Over the years, the geneticists at Partners have collected medical details about their patients, and they’ve drawn their own conclusions about the risks posed by certain variants. They came to the conclusion that my DSG2 variant was probably harmless.
Lebo and Green then took a closer look at my DSG2 variant. While it is rare, it’s much more common than the form of cardiomyopathy that scientists have linked it to. That’s not what you’d expect if the variant caused the disease. Green and Lebo were now confident they could rule it out.
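The logic of that last check can be made concrete with a little arithmetic. The sketch below assumes, for simplicity, that one copy of a variant would be enough to cause disease and that every person with the disease carries the variant. The function name and the two frequencies are hypothetical, chosen only to illustrate the reasoning; they are not figures for DSG2 or for any real condition.

```python
def max_penetrance(disease_prevalence: float, carrier_frequency: float) -> float:
    """Upper bound on the fraction of carriers who could have the disease.

    Even if every single case were caused by the variant, the number of
    affected carriers cannot exceed the number of cases in the population.
    """
    if carrier_frequency <= 0:
        raise ValueError("carrier_frequency must be positive")
    return min(1.0, disease_prevalence / carrier_frequency)


# Hypothetical numbers, for illustration only.
prevalence = 1 / 5000   # share of people with the disease
carriers = 1 / 200      # share of people carrying the variant

bound = max_penetrance(prevalence, carriers)
print(f"at most {bound:.0%} of carriers could be affected")
```

With these made-up numbers, no more than 4 percent of carriers could develop the disease, so the great majority would stay healthy. A variant that leaves most carriers unaffected is a poor candidate for a direct cause.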
“A few years ago, this would have been worrisome for you,” Green told me. “If you stopped there, you’d say, ‘Oh, my God, I have this mutation!’ Today we’re going to call it ‘likely benign’ and relieve you of that.”
“How could geneticists get that so wrong?” I asked.
“It’s really shocking in a way,” Green said. “Everybody was doing it with the best intentions and the highest integrity. But the presumption was wrong.”
The sky outside Green’s window was getting dark and it was time for me to head home. “So there’s your genome,” Green said. He had a sly look on his face. “Ultimately, the more you know, the more frustrating an exercise it is. What seemed to be so technologically clear and deterministic, you realize is going through a variety of filters — some of which are judgments, some of which are flawed databases, some of which are assumptions about frequencies, to get to a best guess.”
Green shrugged his shoulders. “And that’s the state of the art.”
As Green and other scientists toured me through my genome, I learned enough to continue exploring on my own. When an interesting new paper on genetics comes out, I’ll sometimes check if I have the variant under study. Recently, scientists identified three variants associated with “subjective well-being.” I had two out of three.
This genomic noodling is great fun, although it may not mean that much to my own existence. But I also know that there will be other studies coming out that I’ll need to approach with more care. It turns out I don’t have variants in the BRCA1 gene that are known to raise the risk of breast cancer — a relief when I think of my daughters. But I do have six other mutations in my BRCA1 gene that scientists have yet to pin down as being dangerous or harmless. As scientists examine those mutations in the future, I’ll have to brace myself for more rolls of the genetic dice.
While I wait for answers, I’ve realized there is something useful that I can do with my genome. I can donate it to science.
In talking with scientists, I came across a number of programs that invite people to turn over their genomes to scientists for research. The New York Genome Center, for example, has set up a nonprofit site called DNA.Land. I’ve given my genome to them. I’ve also given it to the biobank at Brigham and Women’s Hospital. And I’m looking for more places to hand over my 3.2 billion bases. I’m not donating my genome because it’s exceptional. It isn’t. It’s thankfully boring. (Well, except maybe for the Denisovan bits.)
But even if it’s not exceptional, my genome can be combined with thousands of other genomes to someday reveal exceptional things about all of us.
(Some of the scientists who analyzed my genome kindly provided their technical results, which you can see here.)