About the series

In “Game of Genomes,” STAT national correspondent Carl Zimmer takes a narrative journey through the human genome — his own. The first journalist known to have acquired the raw data of his own genome, Carl spent months interviewing leading scientists about the latest in genome research to learn more about himself and about human genomes in general. This project will run in three parts.

Credits:

Story editor: Jason Ukman
Multimedia editor: Jeffery DelViscio
Visuals editor: Alissa Ambrose
Illustration: Molly Ferguson
Animation: Dom Smith
Web development: Corey Taylor, Ryan DeBeasi, Jim Reevior
Project manager: Tony Guzmán
Copy editor: Sarah Mupo

Episode 1: Man inside the hard drive

A scientist recently pointed me out to his colleagues. “That is not Carl Zimmer,” he declared.

The scientist was Mark Gerstein. He was sitting at a table in his office at Yale University, flanked by two members of his lab. “Really,” Gerstein said, pointing to a slim hard drive on the table, “this is Carl Zimmer.”

By “this,” he meant the sequence of my genome, which was being transferred from the drive onto a MacBook.

“I’m quite serious,” Gerstein said. “In about five minutes, he will be in this computer.”

I had come to Yale to give Gerstein and his colleagues my genome to explore. I wanted them to help me find out what was in there.

I was doing something far different — and far more exciting — than getting a conventional genetic test from a doctor or sending my spit to a genealogy company. Those tests typically only determine snippets of a person’s DNA, providing the sequence of less than 1 percent of the genome. Instead, I had gotten my entire genome sequenced and had then managed to get hold of all the raw data — the information that scientists use to understand how people’s genes help make them who they are.

If you could have read the data flowing into Gerstein’s MacBook, you would have seen a spreadsheet from hell. Each row contained a string of A’s, C’s, G’s, and T’s in various combinations, running a couple hundred letters long, accompanied by a few cells containing short numbers and codes. All told, there were 1.2 billion rows.

Watching my genome flow into Gerstein’s computer made me a little giddy. I began writing about DNA sequencing in the 1990s, at a time when sequencing the human genome — any human genome — seemed about as easy as a manned mission to Mars.

It took hundreds of scientists — and about $3 billion — to assemble the first human genome sequence in 2001. Since then, the cost of DNA sequencing has crashed, while the accuracy has skyrocketed. Scientists have now sequenced the genomes of an estimated 150,000 people.

Despite this sequencing explosion, very few of the people who have their genomes sequenced get their hands on their own genomes. And those few people typically only get a highly filtered report. To get the raw data, as I managed to do, is almost unheard of. I am, to my knowledge, the first journalist to do so.

So over the past several months, I enlisted Gerstein and two dozen other scientists to help me see what’s lurking in my own genome. They have generously volunteered their time and expertise, acting like scuba diving guides, leading me through undersea canyons.

The experience has revealed to me quite a lot about myself — but also, more importantly, about human genomes in general, and the advances scientists are making in understanding them.

What have I learned? I’ve used my genome to look back a million years to our pre-human origins. I’ve discovered exactly which pieces of DNA I inherited from Neanderthals and how they may influence my health. I’ve learned that I have extra copies of some of my genes, and I’m even missing vast chunks of DNA found in other people. I’ve inspected the three-dimensional shape of some of my proteins, observing how mutations have changed the way they work and made me vulnerable to some diseases.

Perhaps just as importantly, I’ve learned just how hard it remains for experts to make sense of anyone’s genome.

Over the coming days, I’ll be unspooling what I’ve learned in a series of stories. I’ll also introduce you to some of the country’s leading genomics experts, who are discovering remarkable things about our DNA. I hope you come along and see what’s hiding in that hard drive.

Episode 2: A code is broken

Soon after I decided to get my genome sequenced, I found myself in a hospital in Boston, with a geneticist staring long and hard at my face.

“What I’m doing is looking for any facial features that would suggest an underlying genetic illness,” Dr. Robert Green told me, as he gave me an exam at Brigham and Women’s Hospital. “The shape of your eyes, whether your ears are low-set or not. The complexity of your ears.”

Green then had me walk back and forth across the office. I felt like a terrier trotting at the Westminster Dog Show. Green explained that some hidden genetic disorders leave telltale signs in our gait.

“Future clinicians may judge this to be unnecessarily cautious,” he said as he watched me pace. “But there is no standard for how we do whole genome sequencing. So this is how I’ve decided to do it.”

I had never watched a geneticist improvise before. But improvisation is par for the course when it comes to genomes. In the history of our species, we’ve never had the chance to look at all of our DNA before. Scientists are still figuring out how our genomes affect our health. And doctors like Green are still working out the rules for using genomes in medical treatments.

A month before my visit, Green had invited me to get my genome sequenced, as part of an educational program run by Illumina, the leading manufacturer of DNA-sequencing machines. We arranged for an exam, and now, having failed to find anything suspicious, Green ordered a blood draw. My blood was shipped from Boston to San Diego, where it was sequenced by Illumina. The process, including my registration at an Illumina-sponsored seminar, cost $3,100.

The Illumina team began the process by cracking open my blood cells and extracting their DNA. They can’t just read the DNA from one end to the other. For starters, a human genome is so big that it would take too long. The DNA might also snap apart into pieces during the process.

Instead, Illumina does something counterintuitive: It smashes the DNA into lots of fragments, makes lots of copies of those fragments, reads them all, and then tries to put their sequences back together.

To do so, they take advantage of DNA’s own capacity to make copies of itself. Each DNA molecule is actually a pair of strands assembled from building blocks known as bases. The bases are like the alphabet in which our genes are written. Instead of our 26-letter alphabet, there are only four different kinds of bases in DNA: A for adenine, T for thymine, C for cytosine, and G for guanine.

When our cells divide, they build a new copy of their DNA. They do so by splitting the old molecule into its two strands, and then building a new strand for each one. This process is remarkably simple. Each base can only pair with one other base: A with T, C with G. To read my DNA, Illumina mimicked this chemistry.
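The pairing rule is simple enough to write down in a few lines. What follows is only a toy sketch of that rule, not a description of Illumina's software; the function names and the example sequence are made up for illustration.

```python
# Pairing rule described in the text: A pairs with T, C pairs with G.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(strand: str) -> str:
    """Return the partner base for every base, in the same left-to-right order."""
    return "".join(PAIRS[base] for base in strand)


def reverse_complement(strand: str) -> str:
    """Return the partner strand as it would be read from its own starting end."""
    return complement(strand)[::-1]


print(complement("GATTACA"))          # CTAATGT
print(reverse_complement("GATTACA"))  # TGTAATC
```

Because each base has exactly one partner, knowing one strand is enough to reconstruct the other, which is what makes the copying step predictable.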

First, the Illumina team broke my DNA into short fragments, each about 340 bases long. Then they made extra copies of those fragments, so they could sequence each one many times over. To do so, they glued the fragments to a plate, which they then submerged in a bath.

In that bath were free-floating bases. Gradually, those bases locked onto the fragments, creating a corresponding strand. And each time a floating base locked onto a fragment, Illumina’s devices could record the reaction. A computer recorded those reactions, and used them to decipher the sequence of each of the original strands.

A team of researchers at Illumina then reviewed the data, evaluating my risk for 1,200 disorders, ranging from familiar ones like lung cancer to obscure ones like cherubism. (Don’t be fooled: Cherubism doesn’t make you look like an angel. It fills your jaw with cysts.)

I couldn’t help but worry about what they might find. As I approach my 50th birthday, I feel lucky to be in pretty good health, yet I wonder what the next few decades have in store for me. On my father’s side, my grandfather died in his 40s of a heart attack, and my grandmother died a decade later of cancer. My mother’s family has been more fortunate; my grandfather lived to 88, while my grandmother celebrated her 90th birthday last summer. How, I wondered, did I do in the genetic lottery?

A few weeks after my visit to Brigham and Women’s, I got a call from Sheila Sutti, a genetic counselor who works with Green. She had my results back from Illumina.

“The reason we’re doing this over the phone and not in person is that we didn’t find anything of clinical importance,” she said. “You had a very benign report, Carl.”

Illumina found that I might not respond well to certain medications, but didn’t find any firm evidence that I suffered from a genetic disorder. I was also a carrier for two diseases, I discovered, which means that I had one copy of a mutation that could make my children sick if they also got the same mutation from my wife. But since those diseases would have made themselves known when my children were young, I knew that those mutations were nothing for my family to worry about.

And that was that.

I felt relieved that Sutti didn’t have any terrible news for me. But after the relief passed, the whole experience made my genome seem very boring. That seemed wrong. I knew that every human genome is infinitely fascinating, if you can just look into it deeply enough.

Before I had my genome sequenced, I had an inkling that I would be let down. A few years ago, I was having lunch with Beth Shapiro and Ed Green, a husband-and-wife team of geneticists at the University of California, Santa Cruz.

“You know what you should do someday? You should get your genome sequenced,” Shapiro declared. “But then you know what you should do? You should get your BAM file. If you do that, you can bring it to scientists like us. Then you can really see what’s going on.”

I didn’t have the courage at the time to admit that I didn’t know what a BAM file was. But now, a few years after the conversation, it was time to find out.

Episode 3: BAM reveals all

A BAM file, I learned, is all the raw data that comes out of genome sequencing. (BAM stands for Binary Alignment/Map.) It’s a tremendous chunk of information, weighing in at 70 gigabytes, the equivalent of over 400 feature-length movies. As big as it may be, nothing else will do for scientists who want to explore a genome in its full complexity.

Yet getting your own BAM file, it turns out, can be surprisingly hard.

After Illumina sequenced my genome, it sent me a medical report and a link to a website where I could peruse its results. It did not simply hand over the BAM file — nor would it if I asked on my own. Delving into its paperwork, I found the reason why: Illumina states that it will provide BAM files “solely for use in clinical research.”

It’s not surprising that companies like Illumina are wary of simply handing over BAM files to the public. In 2007, the genetic testing company 23andMe began providing genetic tests directly to customers, rather than through a doctor. For each customer, it identified hundreds of thousands of genetic variants and interpreted the results to estimate how much risk they put people at for a variety of diseases. In 2013, the FDA demanded it stop selling the tests because the agency hadn’t validated them.

For its tests, 23andMe looked only at several hundred thousand genetic markers sprinkled across the genome. A BAM file, by contrast, contains information about all 3 billion base pairs in a person’s genome. What’s more, all that raw data contains errors that require an expert to weed out. Handing over such a huge pile of flawed data directly to customers can be a recipe for disaster. Customers struggling to interpret a BAM file for themselves may misdiagnose themselves and run after treatments they don’t need.

But all this caution puts curious people in a difficult spot. Even a genome expert can be left out in the cold.

Brad Gulko is a graduate student at Cornell, where he is developing new methods to analyze genomes. For years, he’s been squirreling away some of his income into a fund to pay for his own genome sequencing. He wanted to get his own BAM file and study it.

“I have my hands in this stuff every day,” he said. “I would love to see where I am in this spectrum.”

Gulko waited patiently as his fund grew and the cost of sequencing DNA fell. By last year, the time was right. In July he contacted a sequencing center about getting his genome sequenced.

He was informed he could get his BAM file only if he could enroll in a research study and gain approval for his participation from an institutional review board. Try as he might, Gulko couldn’t figure out how to pull the right strings to get one. He remains a genome expert without access to his own genome.

“It’s a little crazy-making to know this information could be had and you can’t get to it,” Gulko told me. “You’re not allowed to know this stuff about yourself.”

If a scientist like Gulko couldn’t get his hands on his own genome, what chance did a civilian like me have? For help, I turned to Robert Green, the geneticist at Brigham and Women’s Hospital who had overseen the sequencing of my genome. He said he had a workaround. “It may be a bit of a clunky process,” he warned.

In addition to treating patients, Green is doing research on how genome sequencing will affect medicine. One of the studies Green is running, called PeopleSeq, is designed to find out how healthy people respond to getting their genome sequenced.

Most people in the study are only getting information about their genome filtered through their doctor. Some are looking at Illumina’s carefully curated website. But Green recently decided to expand the study to let participants get their hands on a hard drive with their own BAM file.

“We have created the protocol to return the hard drives, but have actually never done it yet!” he emailed me. “You might be the first.”

I joined the PeopleSeq study, and Green’s team then asked Illumina for the BAM file. They also had me sign a form stating that I understood that the data hadn’t undergone the quality checks that Illumina had used to generate my clinical report.

Green and I first talked about this plan in August. Summer turned to fall, fall to winter. Finally, in mid-January, a UPS box arrived at my house. Inside was a tube of green bubble wrap, inside of which was a black fabric case shaped like a kidney bean. I unzipped the case, and inside I found a hard drive with a brushed-metal gleam. The process might be clunky, but it had worked.

I was ready to enlist scientists to look at my BAM file.

The first to agree to join the expedition was a scientist named Konrad Karczewski at the Broad Institute in Cambridge, Mass. I realized the BAM file was so gigantic that I couldn’t simply email it to him. The easiest thing to do would be to go old school. I would deliver my genome by hand.

I put the hard drive back in its bean-shaped case, dropped the case in my shoulder bag, and hopped on a train to Boston. I became my genome’s own personal courier.

Episode 4: Rosetta Stones

Sunshine poured through a wide window at the Broad Institute in Cambridge, Mass., illuminating a wall covered in scribbles. Konrad Karczewski, a young bioinformatics expert, had just spent an hour doodling diagrams to show me how he had pieced together my genome, which was now displayed on his laptop.

The whole process had taken Karczewski — or, to be more accurate, the Broad’s servers — two weeks to complete.

DNA sequencing is so familiar to us now, in the news and on TV crime shows, that it’s easy to get the impression that reading a genome is as simple as pulling a book off a shelf and thumbing through its pages. In fact, Karczewski showed me, scientists are still learning how to assemble that text accurately and completely. If the genome is a book, it’s written in mysterious hieroglyphics, and scientists are still inventing Rosetta Stones to read it.

“Biology is complex,” Karczewski told me, with a resigned shrug. “We already knew that, but I don’t think I really appreciated how ridiculous the problem is until I started doing this.”

When Illumina sequenced my genome, what it actually did was read the sequence of 1.2 billion fragments of my DNA. At this stage, these fragments (known as reads) are like loose jigsaw puzzle pieces waiting to be put together. Making this puzzle even more challenging, some of those fragments contain mistakes due to bad chemical reactions.

To solve a jigsaw puzzle, we can refer to the picture on the box it came in, in order to find a part that matches each piece. Scientists like Karczewski have a puzzle box of their own, called the human reference genome. It’s a highly accurate sequence of a single person’s genetic material. Since all people have relatively similar DNA, Karczewski could use the human reference genome to pinpoint the location of many of my own reads.

But because the reads are so short and the human genome is so long, Karczewski didn’t want to simply run a brute-force search for matches. That would have taken centuries to complete. Instead, using some clever shortcuts, Karczewski needed only 30 hours to figure out where most of the reads belonged.
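The text doesn't say which shortcuts Karczewski used. One common idea, shown here purely as a toy sketch, is to index the reference once so that each read can jump straight to a handful of candidate spots instead of being slid along the whole genome. The function names, the seed length `k`, and the short made-up reference string are all assumptions for illustration; real aligners rely on far more compact indexes.

```python
from collections import defaultdict


def build_index(reference: str, k: int) -> dict:
    """Map every k-letter substring of the reference to its start positions."""
    index = defaultdict(list)
    for pos in range(len(reference) - k + 1):
        index[reference[pos:pos + k]].append(pos)
    return index


def map_read(read: str, reference: str, index: dict, k: int,
             max_mismatches: int = 1) -> list:
    """Return reference positions where the read fits with few mismatches."""
    hits = []
    # Look up only the read's first k letters, then check the rest in place.
    for pos in index.get(read[:k], []):
        window = reference[pos:pos + len(read)]
        if len(window) < len(read):
            continue  # the read would run off the end of the reference
        mismatches = sum(a != b for a, b in zip(read, window))
        if mismatches <= max_mismatches:
            hits.append(pos)
    return hits


reference = "ACGTTGCAAGGCTTACGGATCCATG"  # made-up stand-in for a reference genome
index = build_index(reference, k=4)
print(map_read("GGCTTACG", reference, index, k=4))  # [9], an exact match
print(map_read("GGCTAACG", reference, index, k=4))  # [9], one letter differs
```

The index is built once and reused for every read, which is where the savings come from. This sketch misses any read whose first `k` letters contain an error, one of several gaps a production tool has to close.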

Next, Karczewski stripped errors out of the reads. He took advantage of the fact that Illumina produces so many reads that they overlap on my genome many times over. If you were to look at a map of my genome at this point, it would resemble an irregular brick wall, with overlapping reads stacked on top of the reference genome. Illumina’s machines sequence each base on average more than 30 different times.

That redundancy allows Karczewski to spot the errors in DNA sequencing. If one read has an A at one spot, while the other 30 have a T at the same spot, you can safely conclude that my genome has a T there.
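As a rough illustration of that majority vote, here is a minimal sketch. The function name and the 80 percent cutoff are assumptions chosen for the example; real tools use statistical models and must also allow for spots where a person's two copies of a chromosome genuinely differ.

```python
from collections import Counter


def call_base(pileup: str, min_fraction: float = 0.8):
    """Return the dominant base among overlapping reads, or None if none dominates.

    `pileup` holds the letter each read reports at one position (assumed non-empty).
    """
    counts = Counter(pileup)
    base, count = counts.most_common(1)[0]
    return base if count / len(pileup) >= min_fraction else None


print(call_base("T" * 30 + "A"))  # T: thirty reads outvote one stray A
print(call_base("ACGT" * 8))      # None: the reads disagree too much to call
```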

This process can fix a lot of errors in a genome sequence, but it doesn’t work on many others. If you scan a genome at this stage, you may find a location where some reads point to an A, some to a C, some to a T, and some to a G. They offer no clue to what the correct base should be.

This kind of error can be caused by a mutation that chops out a large piece of DNA, or one that accidentally duplicates it. They confuse computer programs that try to match reads to the human reference genome; the programs end up pinning the reads to the wrong place.

Fortunately, scientists have recently written a number of programs that can spot places where insertions and deletions create this confusion. But it’s no small task to fix these errors — Karczewski needed 15 hours to identify insertions and deletions in my own genome.

Only after two weeks of this scrubbing and fixing did Karczewski finally allow the Broad’s servers to write out my genome sequence. I was then able to give other scientists access to this new and improved version of my genome, so that they could explore it.

I came back to the Broad when Karczewski had finished, to inspect his handiwork. To show me my genome, Karczewski opened a special kind of browser. Think of it as Google Chrome for DNA. It displayed my genome as a string of bases arrayed on a line that ran off of either side of the screen. The browser highlighted stretches of DNA in genes; Karczewski could zero in on any one of them for a close inspection.

To give me a sense of how good his reconstruction of my genome was, Karczewski navigated to a gene called HTT.

HTT is not just any gene. Certain mutations in HTT cause Huntington’s disease, a devastating disease that starts in middle age, leads to dementia, and ends with death. Unfortunately, these mutations are also very hard to recognize with standard genome-sequencing techniques.

The problem with HTT is that the mutations strike a region of the gene that is made up of the bases C, A, and G, repeated over and over. Healthy people have a wide range of CAG repeats. It’s only when people get 37 or more CAG repeats in HTT that they are at risk of developing Huntington’s. Repeating DNA is very hard to sequence accurately with short reads, because there aren’t any distinctive sequences to anchor them.
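Once a stretch of sequence has been reliably assembled, the counting itself is the easy part. The sketch below counts the longest uninterrupted run of CAG in a string. The function name and the toy sequence are invented for illustration, and this is in no sense a medical test.

```python
import re


def longest_cag_run(sequence: str) -> int:
    """Return the number of repeats in the longest uninterrupted run of CAG."""
    runs = re.findall(r"(?:CAG)+", sequence)
    return max((len(run) // 3 for run in runs), default=0)


# Made-up sequence: six CAG repeats with arbitrary letters on either side.
toy = "ATGGCG" + "CAG" * 6 + "CAACAGCCGCCA"
print(longest_cag_run(toy))  # 6
```

The hard part, as the text explains, comes earlier: a short read made up entirely of CAGs offers no clue about where in the run it belongs.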

When Illumina sequenced my genome, it had not been able to reconstruct my HTT gene completely. Rather than make a bad guess, it simply left parts of it blank. When it used my genome to identify my risk for genetic disorders, it didn’t even try to determine if I would develop Huntington’s disease.

But now, thanks to Karczewski, I was looking at a complete sequence of my HTT gene. If I wanted to, I could just lean forward and count my CAG repeats.

“You know … I should have probably started with a crap-ton of genetic counseling before we did this,” Karczewski said.

Such are the risks you face when you take your genome into your own hands. Nonetheless, I quickly did some bioethical calculations.

It takes just one defective copy of HTT to cause Huntington’s disease. But none of my ancestors I knew of suffered from the disorder, making it unlikely they could pass it down to me. Yet I also knew that about 10 percent of cases of Huntington’s disease occur out of the blue, the result of a new mutation that adds a number of CAG repeats to a person’s HTT gene. But such events are very rare.

I was pretty confident that I would have a normal HTT gene. But in the unlikely event that I did have Huntington’s disease, I decided, I’d rather know now than wait to be horribly surprised.

“Let’s look,” I said.

And so we did.

The reference genome has 19 CAG repeats. We counted only 17 in mine.

If Karczewski’s reconstruction was accurate, then I don’t have to worry about developing Huntington’s disease.

Whole-genome sequencing is not yet accurate enough to serve as a reliable medical test for a particular disease like Huntington’s. People who have relatives with Huntington’s and want to see if they carry the mutation should get precise tests that determine only the sequence of HTT, ignoring the rest of the genome.

But just because whole-genome sequencing isn’t 100 percent accurate doesn’t mean it’s not valuable. Thanks to the careful assembly by Karczewski and other scientists, I at last had a reconstruction of my genome that I could explore.

Episode 5: Individual Z dissected

I walked into a conference room at Yale not long ago to find eight graduate students and postdoctoral researchers waiting for me on either side of a long table. They invited me to sit at the head. In front of me, on the opposite wall, was a giant monitor. On it read the words, “Individual Z Overview.”

For two weeks, these researchers had been poring over my genome, and now they were ready to share with me what they had found. I had been gratified by how eager they had been to help me, but puzzled, too. It was only when I looked up at the screen that I realized the answer. To them, I was Individual Z.

It was as if I was a frog that had hopped into an anatomy class with my own dissecting scalpel, asking the students to take a look inside.

The students all worked at a lab run by Mark Gerstein at Yale. I had asked Gerstein to look at my genome because he has studied thousands of human genomes over his career. He and his colleagues are experts at making catalogs of genomes — recognizing the genes and millions of other pieces that make them up, and figuring out how those parts vary from one person to the next. Yet when I approached Gerstein about my project, he admitted he had never looked at his own genome this way.

“I’d never have the courage to do this — I’m just too timid,” Gerstein admitted to me. “I’m a worrier. Every time there would be a new finding, I’d look in my genome to see if I had it.”

While Gerstein might be too nervous to look at his own genome, he seemed to take vicarious pleasure in looking at mine. “I really want to do this,” he said when I handed him the hard drive with my genome’s raw data. “I think this is the future.”

Gerstein transferred the data onto his computer and gave the hard drive back to me. Like Karczewski, Gerstein and his team then used a set of computer programs to analyze my BAM file and build a highly accurate reconstruction of my genome. Once they had reconstructed the sequence of my DNA, they could start identifying the parts that made it up.

The parts of a genome we’re most familiar with are, of course, genes. Each protein made by our bodies — such as the collagen in our skin and the myosin in our muscles — is encoded by a gene. Our 20,000 or so protein-coding genes take up only about 1 percent of our genome, however. They are scattered amid vast stretches of so-called non-coding DNA. Non-coding DNA is a mishmash of different elements. Some of them, like on-off switches for genes, are essential to our well-being. A lot of them are just along for the ride.

In order to map the parts of my genome, Gerstein and his colleagues took advantage of the fact that one person’s genome is pretty similar to anyone else’s. If you want to find your COL1A1 gene for collagen, for example, you’d best look about midway down your chromosome 17. That’s where it is in everyone else.

But while my genome is a lot like everyone else’s, it’s not identical. When a scientist like Gerstein sets out to catalog my genome, a lot of the work goes into tallying up those differences.

When I returned to Gerstein’s lab for my Individual Z Overview, Fabio Navarro, a Brazilian postdoctoral researcher with a scruffy beard, kicked things off by introducing me to a big number: 3,559,137. That is how many positions in my genome differ by a single base from the human reference genome — a single nucleotide polymorphism, or SNP for short.
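At its simplest, tallying SNPs means walking along two aligned sequences and noting every position where the letters differ. The following is a toy sketch with made-up sequences and a hypothetical function name. It ignores the fact that each person carries two copies of most chromosomes, and it assumes the sequences have already been lined up.

```python
def find_snps(reference: str, sample: str) -> list:
    """List (position, reference_base, sample_base) wherever two aligned sequences differ."""
    if len(reference) != len(sample):
        raise ValueError("sequences must already be aligned to the same length")
    return [(i, r, s) for i, (r, s) in enumerate(zip(reference, sample)) if r != s]


reference = "GATTACAGATTACA"  # made-up stand-in for the reference genome
sample = "GATCACAGATTGCA"     # made-up stand-in for one person's sequence
snps = find_snps(reference, sample)
print(snps)       # [(3, 'T', 'C'), (11, 'A', 'G')]
print(len(snps))  # 2
```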

For example, I have rare SNPs in a gene called MEFV. At one location in that gene, the vast majority of people have a base called thymine. But one of my copies of the MEFV gene has a cytosine at that spot. This variant gives me the rare distinction of being a carrier for a disease called familial Mediterranean fever, which causes runaway inflammation. (You need two copies to actually get the disease.)

It was a struggle for me to think clearly about the 3,559,136 other SNPs in my genome. I was tempted to think of them as making me an exquisitely unique genetic snowflake.

Sushant Kumar, another postdoctoral researcher who works with Gerstein, dispelled that illusion by picking out two people from a database of genomes to compare me to. One was a person from China, the other from Nigeria. Kumar found that all three of us shared a lot of SNPs: 1.4 million, in fact. Kumar and his colleagues cut down my uniqueness even more by searching for my SNPs in a database they helped build, called the 1000 Genomes Project. They found over 91 percent of my SNPs in at least one other person’s DNA.
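A comparison like this boils down to set arithmetic. In the sketch below, each variant is a tuple of chromosome, position, and base. The coordinates, names, and tiny sets are entirely hypothetical and only show the shape of the calculation.

```python
def shared_fraction(mine: set, others: list) -> float:
    """Fraction of my variants that appear in at least one other person's set."""
    seen_elsewhere = set().union(*others)
    return len(mine & seen_elsewhere) / len(mine)


# Hypothetical variants, each written as (chromosome, position, base).
me = {("chr1", 1000, "T"), ("chr2", 500, "G"), ("chr7", 42, "A"), ("chr16", 77, "C")}
person_a = {("chr1", 1000, "T"), ("chr7", 42, "A")}
person_b = {("chr2", 500, "G"), ("chr9", 9, "C")}

print(len(me & person_a))                         # 2 variants shared with person A
print(shared_fraction(me, [person_a, person_b]))  # 0.75
```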

This enormous genetic overlap is the result of humanity’s sloshing global gene pool. Every new baby gains a few dozen new SNPs. They can pass on some of those SNPs to their own children. Over thousands of years, the variants spread from continent to continent.

Gerstein and other scientists want to understand how these SNPs influence our body. For now, most of what we know about the variations is limited to our protein-coding genes. But even in that 1 percent of our genome, our knowledge is limited.

I discovered, for instance, that I have a variant in a gene called HMGA2 that makes me a little taller. On average, people with my variant are about a quarter of an inch taller than people without it. But scientists don’t yet know exactly how it boosts the growth of people like me who carry it.

It’s likely that variants like the one in my HMGA2 gene influence my biology by changing the shape of my proteins. When proteins change shape, they work differently. During my visit with Gerstein’s team, one of his graduate students, Declan Clarke, provided me with a startling demonstration: He showed me the shape of some of my mutant proteins.

One of my mutations changes the shape of an enzyme in my liver. Our livers keep our blood clean by breaking down potentially harmful molecules so that they can get flushed out of our bodies. One of those enzymes, called NAT2, helps break down caffeine and other toxins with a similar molecular structure.

I have a variant in my gene for NAT2 that changes the enzyme’s shape. Clarke showed me how a pocket on my enzyme has an odd bulge. That bulge changes the way my NAT2 enzymes behave. In most other people, that pocket repels water molecules. In mine, it attracts them.

As a result, my NAT2 enzymes work slowly, allowing toxins to build up and linger longer in my body. Making matters worse, my defective pocket raises the risk that NAT2 enzymes will stick to each other, or to other proteins. To protect me from this damage, my cells destroy a lot of my NAT2 enzymes.

“Let’s say you have an old beat up car and you’re driving it around on the road,” said Clarke. “It’s like the other proteins are saying, ‘We have to impound this thing.’”

While getting rid of a lot of my NAT2 enzymes may reduce my risk of dangerous clumping, it also leaves me with even fewer of them. As a result, I end up doing an even worse job at breaking down certain toxins. And it’s not just toxins that can pose a problem: NAT2 helps break down certain medicines, too. Geneticists have found that my variant puts people at risk of bad side effects from those drugs.

While some mutations alter proteins, others destroy them. They disrupt genes so badly that our cells can’t use them to make any functional proteins at all.

A broken gene (technically known as a loss-of-function variant) can be a very dangerous thing. If you don’t have a functional F8 gene, for example, you can’t make an essential clotting protein. You get hemophilia and can bleed to death from a little cut.

In my own genome, Gerstein and his colleagues discovered 13 genes in which both copies appear to be broken. I have another 42 genes in which only one copy looks like it’s defunct.

It may sound strange that my genome has dozens of broken genes that cause me no apparent harm. If it’s any consolation, I’m no freak. The 1000 Genomes Project revealed that everyone has a few dozen broken genes.

Our genomes are not finely engineered machines that can’t tolerate a single broken flywheel or gear shaft. They’re sloppy products of evolution that usually manage to work pretty well despite being riddled with mutations.

I’ve probably passed down some of my uniquely broken genes to my children. Perhaps, long in the future, one of those broken genes will become more common in humans, and end up in every member of our species. That’s certainly happened in the past. My genome catalog includes about 14,000 genes that have been broken for thousands or millions of years, known as pseudogenes. Once they lost the ability to make proteins, they simply became extra baggage carried down from one generation to the next. Thanks to a genetic roll of the dice, they ended up becoming common. Now these 14,000 pseudogenes are found in all humans today.

“It’s neat — this is evolution in process,” Gerstein said. The unbroken continuum from my own broken genes to humanity’s shared pseudogenes is testament to the long, error-filled journey that produced our complicated, baffling genomes today.

I left my Individual Z Overview with a sense of how SNPs can change my proteins, and thus change my biology. But now that I had an assembled, catalogued genome, I knew that there was still a lot more to explore. There were pieces of non-coding DNA in my genome that influence my health. There were huge pieces of DNA that have vanished from my genome. I could even discover mutations that protect me from diseases. To join me on this next stage of the adventure, come back for the next installment.

(Some of the scientists who analyzed my genome kindly provided their technical results, which you can see here.)

Leave a Comment

  • It’s really an interesting way of explaining the human genome in simple terms! While I was reading the first season’s fourth episode, “Rosetta Stones,” I found an explanation of the human reference genome that is a bit different from the actual one. I just wanted to clarify that the human reference genome was built (assembled) not from a single person’s genome but from a group of people (13, if I am not wrong) by the Genome Reference Consortium. Why was it done like that? A single person’s genome cannot act as a reference genome because, since humans moved out of Africa, they have adapted to different environments, food habits, etc. We humans differ by a very small percentage. Though the difference between each of us is 0.01%, for something called the human reference genome to be used by the scientific community in different places, it has to be a consolidated representation of a few individuals from all over the world to avoid any biases.

  • Thanks for a great article!

    It appears the genetic data are only being compared to a few measurable attributes of the individual — such as height, weight and notable disease. The usefulness of DNA for predicting outcomes would be enhanced by collecting a much wider array of socioeconomic data, including variables for which no established link exists to DNA.

    For example, do top federal officials and corporate CEOs share identifiable differences in their DNA from the average citizen? Do professional athletes? Fathers of twins? Entrepreneurs and explorers? Nobel Prize winners?

    The work with DNA is all about predicting outcomes, so it is unreasonable to focus solely on the DNA while downplaying outcomes. Since this is a new field of study, it is foolish to rule out possibilities that have never previously been explored.

    This obviously creates security issues and ethical standards are needed to limit abuse, but in the long run it seems silly to take a ‘census’ and then not use the data in the most creative, productive ways possible.

  • Thank you for making this report easy to comprehend. You were very brave to undertake this study and then share it so openly.

  • Great series. I’m looking forward to reading future episodes. If you don’t mind, I would like to make a suggestion. If an acronym such as SNP (single nucleotide polymorphism) will be important later in the episode, please repeat the full name occasionally. I was able to go back and find it, but my memory for acronyms isn’t what it once was. Thanks for listening.

  • I had my DNA tested by Ancestry.com and then sent it to FamilyTree.
    I also got the raw data from Ancestry.com but haven’t done anything else with it. What are the chances of using what I have to get fully sequenced?

  • The first few hundred sequence fragments from various organisms were assembled by Richard Grantham in the 1970s and bioinformatic analyses took off. Unfortunately, those of us who wanted to understand the information lost out in the funding game to those who blindly sequenced. Thus, here is Carl’s genome (and many more). It is great that we can make sense of some of it, but much of it reads as mere gobbledygook! Indeed, some find it most convenient to label the gobbledygook “junk” and move on.

  • Good job, Carl! I’ve been with you since Parasite Rex.
    I’m working here on the ol’ Biome—(http://biomemechanic.blogspot.ca)—so why don’t you take the Large Organisms for the moment, and I’ll take care of our li’l friends!

    Keep up the good work!

    Nick

  • I find your writing so fascinating. It’s great stuff, and this latest journey through your genome is amazing. Looking forward to the next chapter/season. Kudos!

  • Thank you for a delightful series. I read your book about E. coli several years ago and was waiting for you to tackle the human genome. Looking forward to future episodes!