
João Pedro de Magalhães scours the human genome for clues that might help us understand why people age and what we might do to stop that. Without fail, each time he’s done one of these studies, at least some of the genes end up having some kind of link to cancer.

“Always,” he said. “You always have some cancer-related genes in there.”


The University of Liverpool researcher started to wonder just how many human genes are associated with cancer, and set about doing an analysis of genetic papers on the online medical archive PubMed. Of the 17,371 human genes studied at one point or another in papers in the archive, the vast majority have some connection to cancer.

“I think for nearly 90% of genes for which there are publications, they mention cancer in at least one of those publications,” de Magalhães said. “That surprised me a bit. I think what it means is that people really study cancer more than anything else.”

On the one hand, his findings, published in a commentary Wednesday in Trends in Genetics, are a bit of an academic oddity. But on the other, de Magalhães believes the results might indicate a trend that is complicating science’s ability to tease out which genes are true drivers of cancer and which are just passengers.


STAT spoke with de Magalhães about the trend and what it means for the future of genetic analyses in cancer. This interview has been lightly edited for length and clarity.

What were some of your first reactions to the analysis results?

I was surprised by how strong the effects are. Nearly 90% of genes are associated with cancer. It’s like a tongue-in-cheek observation, you know? Like, hey, if you work on cancer, any gene is likely to be associated with cancer.

But there have also been people pointing out that when you analyze genetic networks, you need to control for the number of publications associated with any gene in order to gather therapeutic insights. So, if you do this type of analysis, you’ll have this bias that the vast majority of genes have already been associated with cancer.

Why does that make it more difficult to study cancer genetics?

The main challenge is that if you’re trying to interpret results or trying to identify new drug targets in the context of cancer, you have too many genes associated with it. If every gene can be associated with cancer, then figuring out which cancer-related genes are driving different types of cancer and identifying the best biomarkers becomes challenging. It becomes a problem of how we prioritize and study the genetics of cancer.

Finding a simple association is enough to have a publication. That’s the problem. By and large, many associations with cancer are quite — I don’t know if weak is the right word. They’re just correlations.

Funnily enough, I was talking about this work with a colleague and she said that something similar is happening for Covid now. A lot of people are just finding associations because there’s such a huge research effort on Covid-19.

How can scientists avoid some of the pitfalls you describe and improve the study of genetics then?

It means you have to be careful. Unless you have direct genetic evidence, you have to be careful of cancer associations, and I don’t think most people do that. I would say I’m guilty of that as well. Also, if you want to associate a gene with cancer, if you study it hard enough then you probably will. A lot of the associations can be spurious, I think, but people can take the opportunity to say, “Hey, I found this gene. It’s associated with cancer. We need money to study it.”

That kind of sounds like a bad thing, but is it so bad? If everyone can wave this big flag and say, “Hey, my gene is also associated with cancer, and it might be important,” maybe that would help more people get funded to do basic science on random genes. Then who knows, maybe you actually do find something really important?

That’s a good question. I don’t know! In an ideal world, we’d have a lot more investment in research, and we’d be able to study all sorts of associations. I guess my take is that funds are limited, so we have to prioritize the funding allocation in some way because you cannot study every gene, right? Some are more important than others.

So, how do we pick the right genes to study?

It’s a gray area. Causal associations would be best. When there are mutations in patients that predispose them to cancer, that would be evidence of a causal role, not just some association. One thing we’ve done is look at the number of publications associating a particular gene with longevity, but you can do the same with cancer. There’s a bit of a subjective element here, too, though.

Do you think the fact that the vast majority of genes have been linked to cancer reveals something about cancer? Like it’s reinforcing this idea that our genetic machinery gets old, makes mistakes, and then it’s cancer?

Yes, that’s right. If you look at genome instability, it increases with age. You can see there are more predispositions, and the number of mutations increases with age in human tissues as well. So, I see this as a factor predisposing you to cancer development.
