
For more than 50 years, the RNA remained hidden in a lymph node that had been snipped out of a 38-year-old man in what is now the Democratic Republic of the Congo. That nub of tissue, the size of a nail on a pinky finger, had been sealed up in a protective block of paraffin.

Once the tissue was freed from its wax casing, scientists at the University of Arizona were able to extract from it a nearly complete genetic sequence of an HIV virus. It is the oldest nearly full-length genetic code for an HIV-1 virus recovered thus far, and it supports the theory that the virus that causes AIDS began to transmit among people within the first decade or two of the 20th century.

“It’s a lot of years of work in there,” Michael Worobey, the scientist whose group carried out the work, said of the research, which was recently posted to the preprint site bioRxiv. “Just on that sequence we’ve been plugging away for more than five years.”


Worobey, whose lab has repeatedly performed virologic archeology on old tissue and blood samples, said the paper has not yet been submitted to a scientific journal for publication. As such, it has not been through the peer review process where independent scientists kick its tires, so to speak.

But Oliver Pybus, a professor of evolution and infectious diseases at the University of Oxford, praised the work.


“Generating a complete genome … from an archived tissue specimen is technically impressive,” Pybus told STAT. “Although its discovery doesn’t substantially alter our current model of the early genetic history of the AIDS pandemic, it does improve our confidence in conclusions previously drawn from modern and partial HIV gene sequences.”

Dr. Jacques Pepin, an infectious diseases professor at the University of Sherbrooke, in Quebec, who has written about the history of the AIDS epidemic, called Worobey’s latest work a “technological feat.” Pepin is working on a second edition of “The Origin of AIDS,” due out in late 2020, and said this work will factor into the updates.

The sample that was examined dates from 1966. The sequence extracted from it is older by a decade than the previous oldest full-length sequence. It provides a snapshot of what the virus looked like when it was circulating undetected in central Africa 15 years before a cluster of strange infections among gay men in the United States led to recognition of a new disease that was eventually called AIDS.

Genetic codes of viruses that infected people in earlier days of the AIDS epidemic can be used by scientists to try to date when the HIV virus moved from primates into people. By studying differences in the viral sequences, scientists estimate how long it has been since the known sequences could have diverged from a common source. It doesn’t tell them when the event happened, but it can suggest that it had to have been prior to a particular time, Worobey said, adding that the new data suggest the jump likely did not happen in the 1920s.

Although there have been a number of estimates of when HIV started transmitting among people, most now focus on the early 1900s. Pepin said it might even have occurred in the final years of the 1800s.

Worobey, who has developed a method he calls “jackhammering” for extracting viral genetic material from samples, spent time in the DRC about 20 years ago while working on his Ph.D. at the University of Oxford.

He learned about a repository of old tissue samples at the University of Kinshasa, and with the help of co-author Dr. Jean-Jacques Muyembe — a renowned Ebola expert and director of DRC’s National Institute of Biomedical Research — he received permission to study them, looking to see if any contained HIV RNA. The tissue specimens were extracted from patients in Kinshasa — then Leopoldville — for diagnostic purposes between 1959 and 1967.

“It was kind of moldering in cardboard boxes in a big heap in the back room,” Worobey said of the collection.

This is exactly the type of trove Worobey looks for to answer the kinds of mysteries his laboratory tries to crack — such as whether a French Canadian flight attendant brought the HIV virus to North America (he did not) or whether the virus responsible for the 1918 Spanish flu pandemic was already circulating in Northern France in 1916 and 1917. His lab is still working on the latter.

“A portion of my time over the last 15 years has been trying to figure out where those dusty old boxes of stuff might be and try to get to them before they end up in the garbage or disappear because the person who knows what’s in them dies,” he said.

More than a decade ago the lab was able to extract HIV RNA, though merely a fragment and not the whole genetic code, from a lymph node taken from a 60-year-old woman in 1960. Since that work was published in 2008, Worobey and his team have been fine-tuning their extraction method, allowing them to get more sequence information from a positive hit.

Worobey said that having genetic data from the 1960s shows that the circulating viruses were then already extremely genetically diverse — meaning they’d been transmitting among humans for a while.

“The only way that that can happen is if there were several decades of evolution prior to the 1960s,” he said.

The researchers tested 1,652 pathology samples and found the HIV sequence in one. That was a bit of a slog, said Worobey, who credited co-author Thomas Watts from his lab.

Though you might think the positive hit would have been a moment for jubilation, this is not eureka-moment science.

“It’s such a long-term project and there are so many times when you think that you might be on to something good but it just turns out to be some sort of non-specific reaction that we’re circumspect until we’re very, very sure that what we’re dealing with is the real thing,” Worobey said. “So it doesn’t lend itself to champagne cork popping. More of a warm glow of satisfaction after years of toil.”

An earlier version of this story incorrectly described the type of genetic material extracted from the specimen.
