Researchers in MDC’s Berlin Institute for Medical Systems Biology (BIMSB) have developed a tool that, for the first time, enables accurate sequencing of messenger RNA molecules together with their tails. Professor Nikolaus Rajewsky and his team recently described the tool, called full-length poly(A) and mRNA sequencing, or “FLAM-seq” for short, in the journal Nature Methods.
Messenger RNA (mRNA) is essential: it carries the instructions encoded in DNA to the cell’s protein-making machinery, which builds the proteins that carry out specific tasks. There are hundreds of thousands of mRNA molecules inside each cell. Almost all of those mRNAs have a “poly(A) tail” – a string of adenosines, which are the As of the basic RNA building blocks A, G, U and C. Researchers are seeking to understand the tails’ influence on mRNA and protein production.
“It is extremely important because almost all mRNA come with a poly(A) tail. Why is that?” said Dr. Ivano Legnini, a postdoctoral researcher in MDC’s Laboratory for Systems Biology of Gene Regulatory Elements and co-first author of the paper. “It’s key to understand how gene regulation works; that’s the process that gives each of our cells its identity.”
The full monty
Researchers strongly suspect that the length of poly(A) tails plays an important role in regulating gene expression. For many years, it was thought that the longer the tail, the more protein an mRNA produced. However, recent work has shown that this correlation holds in some systems or developmental stages, but not across the board.
As sequencing technologies have advanced, researchers have been able to estimate the length of poly(A) tails and get an idea of which genes they belong to, but they did not have complete sequences of the mRNAs or their tails.
Only in the past few years have new sequencers been developed capable of sequencing whole molecules with thousands of nucleotides. The MDC team, led by Nikolaus Rajewsky, head of the Systems Biology of Gene Regulatory Elements Lab and scientific director of the BIMSB, wanted to see if they could sequence the entire mRNA with its tail using this new technology. To do that, they first had to prepare what is called a “cDNA library” – the complementary DNA, or DNA equivalent, of mRNA and its tail.
While it sounded simple, it took the better part of six months for the team to figure out a protocol to copy the entire mRNA and its tail into a single strand of DNA. Building on the expertise of technician Salah Ayoub, the team tweaked the preparation process with different chemical tools until they found the optimal reaction. Crucially, they also added a string of Gs and Is to the end of the As, which allowed them to copy the entire poly(A) tail.
“We tailed the tails,” Legnini joked.
Their DNA library was put through the PacBio sequencer, which produces full-length sequences of single molecules. On the computational side of the lab, Dr. Nikos Karaiskos analyzed the long sequences of G, T, A and C letters and tested them for accuracy – both by testing samples with a known number of As, and by comparing against other sequencing methods and datasets. The results show the FLAM-seq method is extremely accurate.
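To make the accuracy check concrete, here is a minimal sketch of how a tail could be measured in a single full-length read. It is not the team's pipeline. It assumes the read is written in the mRNA's own orientation with DNA letters, that adapter and G/I sequence has already been trimmed from the 3' end, and that a single stray non-A base inside the tail should not end it. The function name `poly_a_tail` and the parameter `max_non_a_run` are hypothetical.

```python
def poly_a_tail(read: str, max_non_a_run: int = 1) -> str:
    """Return the putative poly(A) tail at the 3' end of a read.

    Scans backwards from the 3' end and keeps extending the tail while it
    sees As, tolerating runs of at most `max_non_a_run` other bases so that
    an isolated C, G or T inside the tail does not cut it short.
    """
    read = read.upper()
    start = len(read)  # index where the tail begins (empty tail so far)
    non_a_run = 0      # length of the current run of non-A bases
    for i in range(len(read) - 1, -1, -1):
        if read[i] == "A":
            start = i
            non_a_run = 0
        else:
            non_a_run += 1
            if non_a_run > max_non_a_run:
                break
    return read[start:]


# A made-up control read: a short body followed by exactly 30 As.
control = "GCTGCT" + "A" * 30
print(len(poly_a_tail(control)))  # 30
```

Comparing the measured length with the known number of As in such a control mirrors the first of the two accuracy tests described above. A scan like this cannot tell where the tail ends and an A-rich stretch of the mRNA body begins, so a real analysis would likely rely on alignment to a reference as well.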
The researchers tried it on all sorts of tissue types – human cancer cells, brain cells produced from induced pluripotent stem cells, worm cells – all with similarly successful results, showing that the protocol works for all mRNA.
“We are really able to look at the complete mRNA in just one shot without having to do complicated computations to assemble fragments to get an idea of the whole mRNA in the cell,” said Jonathan Alles, co-first author from the Rajewsky Lab.
Not all As
The initial analysis revealed some interesting insights. The tails are not all As. Previous research had shown that the very ends of the tails contain nucleotides other than As, but the bulk of each tail had been inaccessible. Now with FLAM-seq, the BIMSB team observed many more places in the tails where As were replaced with cytidines (Cs).
Another finding is that different mRNAs, even when produced by the same gene, have very different tail lengths, some 30 nucleotides long, others several hundred. If length is not correlated with protein production levels as previously thought, what is the purpose of such dramatically different lengths?
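Both observations, non-A bases within tails and widely varying lengths within one gene, amount to simple summaries once each read has a gene assignment and a tail sequence. The sketch below is illustrative only: the function `summarize_tails`, the input format of `(gene, tail)` pairs, and the example gene names and tails are all assumptions, not data or code from the study.

```python
from collections import defaultdict
from statistics import median


def summarize_tails(records):
    """Group (gene, tail) pairs by gene and summarize tail length and content."""
    tails_by_gene = defaultdict(list)
    for gene, tail in records:
        tails_by_gene[gene].append(tail.upper())

    summary = {}
    for gene, tails in tails_by_gene.items():
        lengths = [len(t) for t in tails]
        total_bases = sum(lengths)
        # Count every base in the tails that is not an A.
        non_a_bases = sum(len(t) - t.count("A") for t in tails)
        summary[gene] = {
            "reads": len(tails),
            "min_length": min(lengths),
            "median_length": median(lengths),
            "max_length": max(lengths),
            "non_a_fraction": non_a_bases / total_bases if total_bases else 0.0,
        }
    return summary


# Invented example: two reads from one gene, one read from another.
example = [
    ("GENE1", "A" * 30),
    ("GENE1", "A" * 100 + "C" + "A" * 99),
    ("GENE2", "A" * 50),
]
for gene, stats in summarize_tails(example).items():
    print(gene, stats["min_length"], stats["max_length"])
```

Reporting the minimum, median and maximum per gene, rather than a single average, keeps the spread of tail lengths visible.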
“These processes have been shaped by evolution for millions of years, I cannot believe this is just random,” Legnini said. “There must be a reason why mRNA comes with tails of different lengths.”
Now that the tool has been developed and validated, and a patent has been filed, Legnini said the real fun can begin: using FLAM-seq to investigate specific questions.
“We know that expression of genes is regulated by crosstalk of different processes, namely transcription, splicing, and tailing,” Rajewsky said. “I am very excited about FLAM-seq, which enables us to directly listen to this talk and will thus help us understand it.”
The group has shared the method on Protocol Exchange and encourages other labs to try it.