If a group of chemists found 18 more potent versions of a drug out of a sea of 3,000 potential chemicals in the span of a few weeks, they might be hailed as superhumans.

That actually happened at Relay Therapeutics, said Dr. Donald Bergstrom, the company’s head of R&D. But the driving force behind it wasn’t human at all — it was artificial intelligence.

AI and machine learning have been hailed as a powerful new tool for drug discovery. But despite the hype, there is still a huge gap between the potential and the reality.


“In principle, it’s useful for everything,” said Celsius Therapeutics’ president and co-founder Christoph Lengauer. But in practice, there are areas that may benefit more quickly from AI-based drug discovery, like immuno-oncology therapies and treatments for autoimmune diseases. AI might be able to figure out the mysterious genetic cause at the root of a given condition.

Bergstrom, Lengauer, and two other experts — Berg’s co-founder and president Niven Narain and Vivid Bioscience’s chief scientific officer Mariana Nacht — joined STAT national correspondent Casey Ross for a discussion about hype, hypotheses, and hiring at the Broad Institute on Tuesday during Boston’s HUBweek. Here are the top takeaways.


Although they’ve become buzzwords, ‘AI’ and ‘machine learning’ have different definitions.

Definitions of AI vary widely. Some people conflate AI with machine learning, though many agree that machine learning is a subset of artificial intelligence.

“AI does mean different things for different people,” Nacht said. “How we’re applying it actually defines what AI is to us.”

One very broad definition of AI includes all machines that analyze data without human guidance, Narain said. “Technically, you can make an argument that Excel is AI. You’re putting data in, and you’re getting a result back,” he said.

That everyone could install something that could be called AI may come as a shock to anyone following a field often described as cutting-edge. But AI really is not a new concept, Narain noted — it’s been around since the early 1980s. What is new is the level of optimism that investors and scientists have about the technique’s potential for results.

“Since 2015, there’s massive hype,” he said.

“It’s not the AI that’s the product. The result and the product is a better, specific drug target,” he said.

The aphorism ‘garbage in, garbage out’ is especially true for AI — and there is an awful lot of garbage out there.

AI works best with big sets of data — but not just any big data set will do. Ideally, scientists will have a few data points collected from a lot of people. But that’s often not what data from clinical trials looks like, Bergstrom said.

“In cancer clinical trials, we have a medium data problem,” Bergstrom said. “It’s too big for a human to discern pattern recognition, but not big enough for most algorithms to be able to make sense of it.”

A trial might have lots of data from only 1,000 people — which may seem like a lot of data, but may not be enough for an algorithm to determine the particular characteristics of which people might benefit most.

“It’s the perfect setup to make false discoveries,” he said.
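
Bergstrom’s warning about false discoveries can be illustrated with a small simulation. The sketch below is not from the panel: it assumes a hypothetical trial of 1,000 patients, 500 measurements per patient that are pure random noise, and a conventional cutoff of roughly p < 0.05. The function name and all parameters are made up for illustration. Because none of the measurements has any real link to the outcome, every “hit” it reports is a false discovery.

```python
import math
import random
import statistics


def count_false_hits(n_patients=1000, n_features=500, z_cutoff=1.96, seed=0):
    """Count pure-noise features that look 'significant' at roughly p < 0.05."""
    rng = random.Random(seed)
    # Hypothetical outcome: each patient is randomly labeled a responder or not.
    responder = [rng.random() < 0.5 for _ in range(n_patients)]
    hits = 0
    for _ in range(n_features):
        # Each feature is random noise, unrelated to the outcome by construction.
        values = [rng.gauss(0.0, 1.0) for _ in range(n_patients)]
        yes = [v for v, r in zip(values, responder) if r]
        no = [v for v, r in zip(values, responder) if not r]
        # Two-sample z statistic for the difference in group means.
        se = math.sqrt(
            statistics.variance(yes) / len(yes) + statistics.variance(no) / len(no)
        )
        z = (statistics.mean(yes) - statistics.mean(no)) / se
        if abs(z) > z_cutoff:
            hits += 1
    return hits


hits = count_false_hits()
print(f"{hits} of 500 noise features passed the cutoff")
```

On average, about 5 percent of the noise features would be expected to clear the cutoff by chance alone, which is why testing many characteristics on a modest number of patients can turn up patterns that are not real.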

Even with the right amount of data, the data itself must be reliable — which isn’t always a given.

“In drug discovery, it’s not good enough if an experiment works for a postdoc every other Friday,” said Celsius’ Lengauer. “When you’re in drug discovery, you need to make drugs that actually work,” he said.

Young machine learning scientists hold most of the cards — if they want to play them.

For jobs involving machine learning, the most experienced candidates may be 23 or 24 years old, Lengauer noted. But there aren’t that many of them — which can put the companies trying to recruit them in a tough spot.

“The problem is when they go, ‘Hey, you have to pay me a lot.’ And I’m like, ‘Wait a second, you’re a 23-year-old, I’m not paying you a lot,’” Lengauer joked. “But then they go, like, ‘Hey, you don’t have anyone else to hire but me.’”

The competition for the best and brightest can be fierce — even among industries.

“There’s a lot of super-smart people out there in that space, there’s such a demand — and not enough of them are going into health and into medicine,” Lengauer said. Instead, many choose to work in finance and advertising or “worse things like security or defense and shit like that,” he said, to laughs.

“I am totally willing to work with people who can’t spell DNA, but have this machine learning experience,” he said, if they can work well in a team to solve a problem.

Younger scientists may also shape the way data is shared — one of the biggest hurdles that machine learning projects face today, Narain said.

“I’m almost jealous of you,” Narain told a postdoc at Massachusetts Institute of Technology during the question-and-answer portion of the panel. That postdoc, Narain said, could push the open source and data sharing movements forward. “It’s folks like you who have the opportunity to create that sea change.”

That doesn’t mean the only scientists who can capitalize on the wave of interest in machine learning will be in their early 20s. “We really need to bring diversity of age, diversity of educational experiences and background,” said Nacht — “to really come at the problem to really give it a 360 view instead of the way we’re always slicing it.”