Genome research is getting faster: AI tool identifies genes in newly sequenced organisms—without lab tests

“It’s like suddenly recognizing paragraphs, chapters, and individual words in a completely unfamiliar book”

28-Jan-2026

Researchers at Forschungszentrum Jülich and Heinrich Heine University Düsseldorf have developed a tool that could significantly transform genome research: Helixer identifies genes directly from DNA sequences – without laboratory experiments or prior knowledge about the organism.

Before biologists can say anything about the genetic characteristics of an organism, they must first know where the genes are located within the long string of DNA letters. This process, known as gene annotation, is one of the most challenging steps in genome analysis. Until now, it required extensive experimental data or well-studied related species for comparison. Helixer now greatly simplifies and accelerates this work. The AI detects typical features of a gene – start and stop signals as well as structural elements such as exons and introns – directly from the sequence.
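To make the idea of "start and stop signals" concrete, here is a deliberately simple Python sketch. It is not how Helixer works, since Helixer relies on deep learning. The sketch only scans the forward strand of a sequence for a start codon (ATG) followed, in the same reading frame, by a stop codon (TAA, TAG or TGA) of the standard genetic code. The function name `find_orfs` and the `min_codons` parameter are assumptions made for this illustration.

```python
# Toy illustration only: a naive open-reading-frame scan, NOT Helixer's method.
STOP_CODONS = {"TAA", "TAG", "TGA"}  # stop codons of the standard genetic code


def find_orfs(seq, min_codons=3):
    """Return (start, end) index pairs of forward-strand open reading frames.

    `end` is exclusive and includes the stop codon. `min_codons` is a
    hypothetical filter that counts the start and stop codons as well.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):  # the three forward reading frames
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first start signal seen in this frame
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None  # look for the next start signal
    return sorted(orfs)


example = "CCATGAAATTTTAGGG"
print(find_orfs(example))  # [(2, 14)]
```

In plants, animals and fungi, genes are usually interrupted by introns, so a simple scan like this misses or misplaces most of them. That is one reason gene annotation has traditionally needed experimental data or comparisons with related species.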

“It’s like suddenly recognizing paragraphs, chapters, and individual words in a completely unfamiliar book,” explains Marie Bolger from the Jülich Institute of Bioinformatics (IBG-4). “This makes genome research much faster – and possible at all for many species.”

Helixer is the first AI tool that can reliably identify genes in such diverse groups of organisms – from plants and fungi to insects and vertebrates. Every year, thousands of genomes are sequenced worldwide, many of them from species that have barely been studied. For these cases, Helixer can now deliver immediately usable gene information that previously required months of work.

The AI predicts gene boundaries, reaching almost the quality of manually curated reference annotations – and it does so without using extra data. In vertebrates, Helixer demonstrates a high level of accuracy and consistently outperforms established gene prediction tools across a wide range of species. Thanks to deep learning, Helixer’s gene structure predictions show markedly superior performance, particularly for plants.

The research team had already presented the concept for Helixer in 2020 and has since developed it into a tool that achieves usable results. Another deep learning-based gene annotation tool from the University of Greifswald, Tiberius, which was released in 2024, currently achieves even better results for mammalian species, but is limited to this taxonomic group.

New momentum for the field of research

“We were able to show that Helixer works across a wide range of organisms – which is crucial for its use in plant breeding, biotechnology, and environmental research,” Bolger emphasizes. “These advances in AI-driven gene annotation are truly exciting for the field.”

Genome sequencing was automated more than 20 years ago, generating an enormous wealth of data. Gene annotation, on the other hand, was long considered a bottleneck in genome analysis. Now it is catching up.

“For almost two decades there were no fundamentally new approaches in this field,” says Björn Usadel, Director of the Institute of Bioinformatics at Forschungszentrum Jülich and Professor at Heinrich Heine University Düsseldorf. “Helixer shows that modern AI methods can help overcome this bottleneck.”

Outlook

The results, initially published as a preprint on bioRxiv and now appearing in Nature Methods, have already been cited many times and attracted substantial attention in the research community – a sign of the tool’s growing importance. “We are already seeing Helixer used in many projects – from crop plants to insects that shape entire ecosystems,” says Usadel.

Future developments are already underway: PhD student Felicitas Kindel at IBG-4 is exploring innovative strategies to build on Helixer’s strengths and extend its capabilities.
