High-Throughput and Complex Gene Expression Validation Using the Universal ProbeLibrary and the LightCycler® 480 System
King-Hwa Ling1,3,4, Chelsee A. Hewitt1, Sarah A. Kinkel1,3, Gordon K. Smyth2,3, and Hamish S. Scott1,3,4*
1Division of Molecular Medicine, The Walter and Eliza Hall Institute of Medical Research, Parkville, Victoria, Australia; 2Division of Bioinformatics, The Walter and Eliza Hall Institute of Medical Research, Parkville, Victoria, Australia; 3Department of Medical Biology, Faculty of Medicine, Dentistry and Health Sciences, The University of Melbourne, Parkville, Victoria, Australia; 4Division of Molecular Pathology, Institute of Medical and Veterinary Science, Adelaide, South Australia, Australia
*Corresponding author
Introduction
During the past decade, a considerable number of transcriptome profiling analyses have been reported. High-throughput approaches such as tag-based serial analysis of gene expression (SAGE) and hybridization-based microarray technologies have been widely used to profile gene expression [1]. However, these techniques are laborious or costly, or involve advanced technologies that limit the sampling depth and thus diminish the sensitivity of the analysis. Gene expression profiles can be rapidly validated with an independent technique such as quantitative polymerase chain reaction (qPCR) to eliminate false results. In most laboratories, qPCR analysis has so far been performed on a small scale. This limits reliable downstream analysis of molecular interactions and pathway predictions, which are becoming important in the emerging field of systems biology. qPCR study design involves consideration of template quality and quantity, various levels of normalization, PCR efficiency (specificity and sensitivity), and data analysis. These factors are vital to ensure a reliable final quantification. qPCR design becomes difficult when these factors must be considered for a study involving multiple tissues, ages, genes, groups (e.g., treatment and control groups) and biological subjects. In this article, we discuss how the Universal ProbeLibrary and LightCycler® 480 System provided flexibility in high-throughput confirmatory qPCR analysis of datasets generated from SAGE and microarray platforms in three complex studies (Table 1).
Materials and Methods
RNA and complementary DNA preparation
Total RNA was extracted from biological triplicates with additional DNase I treatment. The concentration and quality of isolated total RNA were determined using the 2100 Bioanalyzer (Agilent Technologies). Total RNA with an RNA Integrity Number greater than 8.0 was considered for subsequent complementary DNA (cDNA) synthesis [2]. First-strand cDNA was synthesized from 1–5 µg of total RNA using random hexamer priming.
Assay design and qPCR
All primers were designed using the ProbeFinder qPCR assay design software, which is freely accessible at www.universalprobelibrary.com. The following parameters were applied to all primers: length = 18–27 bp, GC% = 30–70, Tm = 59–60°C, and amplicon size = 50–200 bp. Prior to qPCR, all primer sets were screened using PCR and conventional agarose gel electrophoresis. Only primer sets that generated specific amplicons were used for subsequent qPCR. All qPCR reactions were prepared in a total volume of 10 µl in a 384-well plate format, with final concentrations of 1x LightCycler® 480 Probes Master, 200 nM forward and reverse primers, and 100 nM Universal ProbeLibrary probe. qPCRs were performed using the LightCycler® 480 System with a pre-denaturing step of 95°C for 10 minutes and 45 cycles of 95°C (10 seconds), 60°C (30 seconds), and 72°C (10 seconds), followed by a cooling step at 40°C for 1 minute.
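The primer design limits listed above can be expressed as a simple check. The following is a minimal sketch, not the ProbeFinder implementation: the function names are hypothetical, and the melting temperature is taken as an input because the Tm model used by the design software is not described here.

```python
# Design limits quoted in the text above (inclusive ranges).
LENGTH_BP = (18, 27)
GC_PERCENT = (30.0, 70.0)
TM_CELSIUS = (59.0, 60.0)
AMPLICON_BP = (50, 200)


def gc_percent(seq: str) -> float:
    """Return the percentage of G and C bases in a primer sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def in_range(value, limits) -> bool:
    """True if value lies within the inclusive (low, high) limits."""
    low, high = limits
    return low <= value <= high


def primer_problems(seq: str, tm_celsius: float) -> list:
    """List the design limits that a single primer violates.

    tm_celsius is supplied by the caller (assumption: it comes from
    whatever Tm calculation the design tool reports).
    """
    problems = []
    if not in_range(len(seq), LENGTH_BP):
        problems.append("length")
    if not in_range(gc_percent(seq), GC_PERCENT):
        problems.append("GC%")
    if not in_range(tm_celsius, TM_CELSIUS):
        problems.append("Tm")
    return problems


def amplicon_ok(amplicon_bp: int) -> bool:
    """True if the amplicon size is within the 50-200 bp limit."""
    return in_range(amplicon_bp, AMPLICON_BP)
```

An empty list from `primer_problems` means the primer satisfies all three per-primer limits. Such a check only restates the design criteria and does not replace the agarose gel pre-screen.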
Data analysis
The crossing-point (Cp) value (cycle number in a log-linear region) of each signal was calculated with the second derivative maximum method implemented in the LightCycler® 480 quantification software, which removes the need for user intervention in threshold selection and baseline subtraction. A set of serially diluted cDNAs was used to construct a four-point standard curve for every PCR system. Two to three housekeeping genes (Hprt1, Psmb2, Pgk1, or Hmbs) were used as internal controls to compensate for variation in reverse transcription-PCR (RT-PCR) setup, RNA integrity, RT efficiency, and cDNA sample quantity. The following criteria were adopted to define a successful qPCR assay: PCR efficiency between 90% and 110%, R-squared > 0.985, and a minimum of two successful housekeeping genes in each run. Based on a successful standard curve, the starting quantity of both target and reference transcripts was calculated from the linear relationship between Cp and the logarithm of concentration. All estimated starting quantities were in arbitrary units. The starting quantity of each target transcript was normalized to the starting quantity of each reference transcript generated from the same biological sample. The resulting ratios for each target gene across developmental stages or subjects were then analyzed using linear modelling and the empirical Bayes moderated t-statistic provided by the limma software package for the R computing environment (www.r-project.org) [3].
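The standard-curve step can be illustrated with a short sketch. It assumes the common convention that Cp is linear in log10 of the starting quantity and that efficiency is derived from the slope as 10^(-1/slope) - 1. The function names and the example dilution values are hypothetical, and this is not the LightCycler® 480 software's own calculation.

```python
import math


def fit_standard_curve(quantities, cps):
    """Least-squares fit of Cp = slope * log10(quantity) + intercept.

    Returns (slope, intercept, r_squared).
    """
    xs = [math.log10(q) for q in quantities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(cps) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, cps))
    syy = sum((y - mean_y) ** 2 for y in cps)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


def efficiency_percent(slope):
    """Amplification efficiency in percent; 100% means doubling per cycle."""
    return (10 ** (-1.0 / slope) - 1.0) * 100.0


def curve_passes(slope, r_squared):
    """Apply the acceptance criteria stated in the text above."""
    return 90.0 <= efficiency_percent(slope) <= 110.0 and r_squared > 0.985


def starting_quantity(cp, slope, intercept):
    """Invert the standard curve to estimate quantity (arbitrary units)."""
    return 10 ** ((cp - intercept) / slope)


# Hypothetical four-point, 10-fold dilution series (arbitrary units).
dilutions = [1000.0, 100.0, 10.0, 1.0]
measured_cp = [20.0, 23.4, 26.8, 30.2]
slope, intercept, r2 = fit_standard_curve(dilutions, measured_cp)
```

With these illustrative values the slope is -3.4 cycles per 10-fold dilution, which corresponds to an efficiency of about 97% and passes both curve criteria. The requirement for at least two successful housekeeping genes per run would be checked separately.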
Results and Discussion
A total of 252 assays were designed. Of these, 235 (93%) generated specific amplicons as observed by agarose gel electrophoresis, whereas 17 (7%) generated nonspecific or no amplicons (Table 2). Successfully pre-screened assays were used in qPCR analysis to validate differentially expressed transcripts (DETs). Based on the previously defined criteria (see Materials and Methods), 161 (69%) assays were successfully performed using the Universal ProbeLibrary and LightCycler® 480 System under the same amplification conditions (without any optimization). Despite generating a specific amplicon, 74 (31%) assays required further optimization. In most cases, assays failed because of a low level of transcript expression, which led to PCR efficiency outside the accepted range (< 90% or > 110%) and non-reproducible results (R-squared < 0.985). Figure 1 shows the different outcomes of agarose gel analysis during assay pre-screening. Most assays that generated a specific dominant amplicon had high PCR efficiency and reproducibility.
Considering only the successful assays, 70 (54%) DET profiles were validated under the stringent analysis in the SAGE brain study. Most of the non-validated DETs in the SAGE study came from short-tag libraries, which were constructed from only one biological subject. In the microarray brain study, 25 (78%) DET profiles were validated. The microarray datasets were generated from experiments performed on different biological subjects (triplicates). Consequently, the validation rate was higher (~80%), which demonstrated better sensitivity compared with the SAGE technique. Whatever the study design or platform, validation of expression profiles is vital to avoid false-positive results in downstream analysis before any pertinent conclusions or specific hypotheses are drawn.
In this study, we demonstrated that the Universal ProbeLibrary and LightCycler® 480 System together provide a flexible platform for successful high-throughput qPCR analysis. We consider four main factors to contribute to a successful and feasible high-throughput qPCR analysis: 1) fast assay design and execution, 2) simultaneous analysis of multiple genes or samples, 3) proper data normalization and handling, and 4) manageable data analysis.
First, quantification of a large number of target genes requires a considerable amount of time for primer design. In addition, it is difficult and time consuming to design intron-spanning primers by manually navigating through many target gene sequences, or through target genes that comprise multiple transcript variants. In our case, the ProbeFinder software provided a batch assay design feature that automatically designed primers and matched a relevant Universal ProbeLibrary probe for up to 10 input sequences or transcript IDs at one time. Furthermore, both common and differentiating assays based on transcript variants can be designed easily. Since the UPL-based qPCR assay requires only standard primer synthesis, the time and cost from primer synthesis to actual qPCR analysis are reduced compared with pre-validated assays or modified oligonucleotides.
Second, high-throughput qPCR involves simultaneous quantification of multiple transcripts and samples. Therefore, a robust universal protocol is required to produce specific, sensitive and comparable assays for each transcript on the same 384-well plate. Our results show that ~70% of qPCR assays were successfully carried out using the same amplification conditions without any optimization. The Universal ProbeLibrary provides multiple choices of hydrolysis probes, which have a high melting temperature due to the underlying locked nucleic acid chemistry [4]. The high temperature ramping rate of the LightCycler® 480 System, the intron-spanning primers, and the Universal ProbeLibrary probe chemistry enable qPCR to be performed on cDNA template at a high annealing temperature, efficiently and with low background signal, thus improving sensitivity and specificity (Figure 1e). In addition, the 8- to 9-base Universal ProbeLibrary probe sequences occur frequently throughout the mouse genome. This provides flexibility in assay or probe selection during experimental design (a replacement assay or probe can be easily selected).
Third, high-throughput quantification of transcripts from multiple tissues at different developmental stages makes it difficult to choose a housekeeping gene for normalization. To date, no single housekeeping gene has been reported to have consistent expression across different tissues through different developmental stages, at least not in an in vivo system. The best solution is to include multiple housekeeping genes for normalization purposes. However, the inclusion of 2–3 housekeeping genes will at least double or triple the total reactions per sample per run. Furthermore, this does not take into consideration the additional reactions necessary to construct standard curves for every qPCR assay. An increase in the number of reactions leads to problems in sample and data handling. Including the reactions for controls, standard curves and all samples on the same plate is considered crucial, because running them on separate plates introduces batch-to-batch variation. In order to include all reactions on the same plate without complicating the post-analysis data handling, systematic placement of samples was used in all the expression profiling studies. This approach is feasible only with the LightCycler® 480 System because its thermal block has been shown to generate a homogeneous temperature across the block [5]; therefore, positional randomization of samples is not required.
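The effect of extra housekeeping genes and standard curves on plate capacity can be worked through with a small calculation. This is a sketch under stated assumptions: the function names are hypothetical, and the technical replicates, the one no-template control per assay, and the sample and gene counts in the example are illustrative values that are not taken from the studies described here. Only the four-point standard curve and the 384-well format come from the text.

```python
def reactions_per_plate(n_samples, n_targets, n_reference_genes,
                        technical_replicates=1, curve_points=4,
                        no_template_controls=1):
    """Count the wells needed when every assay is run on every sample.

    Assumptions (not from the source): standard-curve points are run
    with the same number of technical replicates as the samples, and
    each assay gets `no_template_controls` control wells.
    """
    n_assays = n_targets + n_reference_genes
    sample_wells = n_samples * n_assays * technical_replicates
    curve_wells = n_assays * curve_points * technical_replicates
    control_wells = n_assays * no_template_controls
    return sample_wells + curve_wells + control_wells


def fits_on_plate(n_wells, plate_size=384):
    """True if the reactions fit on a single plate of the given size."""
    return n_wells <= plate_size


# Hypothetical layout: 12 samples, 5 target genes, triplicate wells.
with_three_refs = reactions_per_plate(12, 5, 3, technical_replicates=3)
with_two_refs = reactions_per_plate(12, 5, 2, technical_replicates=3)
```

In this illustrative layout, three housekeeping genes require 392 wells and exceed one 384-well plate, whereas two housekeeping genes require 343 wells and fit. This is the kind of trade-off between normalization quality and plate capacity that the paragraph above describes.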
Finally, analysis of large-scale qPCR data remains the most challenging part of such a high-throughput project. Data analysis involves data conversion, normalization between biological replicates and with multiple reference genes, and statistical comparison. To date, there is no one-step application that supports our customized workflow and stringent analysis criteria. Therefore, we used a customized pipeline designed in-house (not discussed here) for multiple comparative analyses.
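The normalization step described in Materials and Methods, in which each target quantity is divided by each reference quantity from the same biological sample, can be sketched as follows. This is not the in-house pipeline: the function names and the example quantities are hypothetical, and the log2 transformation before statistical modelling is an assumption rather than something stated in the text.

```python
import math


def normalized_ratios(target_quantity, reference_quantities):
    """Ratio of one target transcript to each reference transcript.

    reference_quantities maps a reference gene name to its starting
    quantity (arbitrary units) measured in the same biological sample.
    """
    ratios = {}
    for gene, quantity in reference_quantities.items():
        if quantity <= 0:
            raise ValueError(f"non-positive quantity for {gene}")
        ratios[gene] = target_quantity / quantity
    return ratios


def log2_ratios(ratios):
    """Log2-transform the ratios (assumed step before linear modelling)."""
    return {gene: math.log2(value) for gene, value in ratios.items()}


# Hypothetical starting quantities for one sample (arbitrary units).
example = normalized_ratios(50.0, {"Hprt1": 25.0, "Pgk1": 100.0})
```

Keeping one ratio per reference gene, rather than collapsing them immediately, makes it possible to see whether the housekeeping genes agree with each other before any statistical comparison.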
Conclusions
As qPCR analysis continues to develop, the Universal ProbeLibrary and LightCycler® 480 System together provide an avenue for large-scale, sensitive, confirmatory high-throughput qPCR analysis. This system generates vast amounts of expression data rapidly without compromising the quality or reliability of the final quantification results.
References
1. Velculescu VE et al. (1995) Science 270:484–487
2. Schroeder A et al. (2006) BMC Mol Biol 7:3
3. Smyth GK (2004) Stat Appl Genet Mol Biol 3:Article3
4. Petersen M, Wengel J (2003) Trends Biotechnol 21:74–81
5. Hoebeeck J et al. (2007) Biochemica 2:7–9
6. Vencio RZ et al. (2004) BMC Bioinformatics 5:119
7. Benjamini Y, Hochberg Y (1995) J R Statist Soc 57:289–300
This article was originally published in Biochemica 2/2008, pages 23-26. ©Springer Medizin Verlag 2008