The New Genome Sequencer FLX System

More flexibility, more breakthrough applications - Roche continues to unlock the sequencing secrets that will change the world

Marcus Droege
Roche Applied Science, Penzberg, Germany

Introduction

In 2005, the Genome Sequencer 20 System was introduced into the global life science market. Based on the revolutionary 454 sequencing technology, this system can generate hundreds of thousands of sequence reads in a few hours at a fraction of the cost of traditional Sanger technology [1].

Since its introduction, the enabling and versatile Genome Sequencer 20 System has significantly changed many areas of research. It addresses many sequencing applications that were previously impossible to handle for technical or economic reasons. Examples include the identification of unknown genes in the human genome by transcriptome sequencing [2], the identification of entirely new classes of small non-coding RNAs in mammals [3,4] and C. elegans [5], sequencing of the Neanderthal [6] and woolly mammoth [7] genomes, analysis of the relationship between obesity and the human gut flora [8], and the finding that the microbial diversity of the deep sea was widely underestimated [9]. More than 50 peer-reviewed papers have been published to date, many of them in high-profile journals such as Nature and Science. Five of these were selected as cover stories for journals such as Nature, Cell, or Genome Research.

In December 2006, approximately one year later, the next-generation system, the Genome Sequencer FLX System, was launched. FLX stands for flexibility and reflects the versatility of the system. Thanks to its improved combination of read length and throughput and its significantly improved sequence accuracy, a broad variety of breakthrough applications can be addressed at higher quality. Compared with other upcoming second-generation technologies, which provide only very short sequence reads (approximately 35 bases), the Genome Sequencer FLX System can reasonably be expected to be the most versatile. It offers the broadest range of applications for research fields such as cancer research, genetic diseases, infectious diseases, plant genomics, metagenomics, and many more.

Overview of Genome Sequencer FLX Features and Resulting Benefits

Average read length of 200–300 bases, depending on the application and the organism

Read length is one of the key factors in high-throughput sequencing (Figure 1). The longer the reads, the fewer gaps remain in the consensus sequences of whole-genome sequencing projects, the more accurately highly variable alleles can be identified, and the more haplotype information is gained. Longer reads also make it easier to assign functions to EST sequences derived from the transcriptome, are a prerequisite for full-length cDNA sequencing, and so forth.

Single read accuracy of more than 99.5%; substitution errors are exceedingly rare

Compared with the Genome Sequencer 20 System, the Genome Sequencer FLX System offers significantly enhanced single read accuracy (Figure 2). Currently, single read accuracies of >99.5% over the first 200 bases are typically achieved. Most notably, this figure already accounts for small insertions and deletions caused by the presence of homopolymers. This means that substitution errors are exceedingly rare (<0.01%) and that the Genome Sequencer FLX System offers high-throughput sequencing with single read accuracies equivalent to or better than traditional Sanger sequencing.
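To put these accuracy figures in perspective, the following minimal sketch converts them into an expected number of errors per read. It assumes the worst case allowed by the quoted bounds (a total error rate of exactly 0.5% and a substitution rate of exactly 0.01%) and a uniform error rate across the first 200 bases; the function name and constants are illustrative and not part of any Roche software.

```python
# Illustrative arithmetic only: expected errors per read under the
# accuracy bounds quoted in the text, assuming a uniform error rate.

def expected_errors(read_length: int, error_rate: float) -> float:
    """Expected number of erroneous bases in a read of the given length."""
    return read_length * error_rate

READ_WINDOW = 200            # accuracy is quoted over the first 200 bases
TOTAL_ERROR_RATE = 0.005     # worst case implied by ">99.5%" accuracy
SUBSTITUTION_RATE = 0.0001   # worst case implied by "<0.01%" substitutions

total = expected_errors(READ_WINDOW, TOTAL_ERROR_RATE)
substitutions = expected_errors(READ_WINDOW, SUBSTITUTION_RATE)

print(f"Expected errors of any kind per read: at most {total:.2f}")
print(f"Expected substitution errors per read: at most {substitutions:.2f}")
```

Under these assumptions, a 200-base read carries at most about one error of any kind and at most about 0.02 substitution errors, which is consistent with the statement that most residual errors are homopolymer-related insertions and deletions.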

Increased throughput leading to faster, more convenient and less costly data generation: 200 Mb per day

A large run on the Genome Sequencer FLX System (LR70 run) typically yields more than 400,000 reads at an average read length of approximately 250 bases, or roughly 100 megabases per run. Since two such runs, each taking less than 8 hours, can easily be started within 10 hours, about 200 megabases of sequence information can be generated per day. Per run only 12 Mb of data are generated and can be reanalyzed if software updates are released.
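The daily throughput figure follows directly from the numbers quoted above. The short sketch below reproduces that arithmetic; it takes 400,000 reads and 250 bases as round values (the text gives these as "more than" and "approximately"), and the function and constant names are illustrative.

```python
# Back-of-the-envelope throughput estimate from the figures in the text.

READS_PER_RUN = 400_000   # "more than 400,000 reads" per large (LR70) run
AVG_READ_LENGTH = 250     # approximate average read length in bases
RUNS_PER_DAY = 2          # two runs of under 8 hours each

def bases_per_run(reads: int, read_length: int) -> int:
    """Total bases produced by a single sequencing run."""
    return reads * read_length

def megabases_per_day(reads: int, read_length: int, runs_per_day: int) -> float:
    """Total megabases produced per day for a given number of runs."""
    return bases_per_run(reads, read_length) * runs_per_day / 1_000_000

per_run_mb = bases_per_run(READS_PER_RUN, AVG_READ_LENGTH) / 1_000_000
per_day_mb = megabases_per_day(READS_PER_RUN, AVG_READ_LENGTH, RUNS_PER_DAY)

print(f"Per run: {per_run_mb:.0f} Mb")   # 100 Mb
print(f"Per day: {per_day_mb:.0f} Mb")   # 200 Mb
```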

This increased throughput per plate leads to faster and more convenient data generation. For example, E. coli can be sequenced by performing three different 20- to 30-Mb runs using the Genome Sequencer 20 System. In comparison, one sequencing run is sufficient to sequence E. coli completely de novo in 8 hours with the Genome Sequencer FLX System.
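The E. coli comparison can be made concrete by estimating fold coverage, that is, total sequenced bases divided by genome size. The sketch below assumes an E. coli genome of about 4.6 Mb (a commonly cited figure for strain K-12 that is not given in this article) and about 100 Mb per Genome Sequencer FLX run (400,000 reads of 250 bases); the function name is illustrative.

```python
# Rough fold-coverage comparison for de novo sequencing of E. coli.
# ASSUMPTION: genome size of ~4.6 Mb (E. coli K-12); not stated in the article.

ECOLI_GENOME_MB = 4.6

def fold_coverage(total_mb: float, genome_mb: float) -> float:
    """Average number of times each genome position is sequenced."""
    return total_mb / genome_mb

# Genome Sequencer 20: three runs of 20 to 30 Mb each (figures from the text)
gs20_low = fold_coverage(3 * 20, ECOLI_GENOME_MB)
gs20_high = fold_coverage(3 * 30, ECOLI_GENOME_MB)

# Genome Sequencer FLX: one run of ~100 Mb (400,000 reads x 250 bases)
flx = fold_coverage(100, ECOLI_GENOME_MB)

print(f"GS 20, three runs: {gs20_low:.1f}x to {gs20_high:.1f}x coverage")
print(f"GS FLX, one run:   {flx:.1f}x coverage")
```

Under these assumptions, a single Genome Sequencer FLX run provides coverage at least as deep as three Genome Sequencer 20 runs combined, which is why one run is sufficient.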

Improved reagents concept

The Genome Sequencer FLX System comes with an improved reagents and fluidics concept, resulting in the following improvements:

- On-board mixing of sequencing reagents resulting in a reduced kit complexity and less expensive shipping and storing conditions

- Improved fluidics reducing run time and reagent consumption

- Improved reagent shelf life

- Faster thaw protocols for reagents (2 hours at room temperature)

- Only one preventive maintenance session per year, thanks to automated tube calibration

- Even greater robustness

Significant cost reduction

Based mainly on the improved throughput (i.e., on the improved combination of read lengths and number of reads generated per sequencing run), costs per deciphered base are significantly reduced.

Even more applications to even more breakthroughs in science

Due to its enabling character, the Genome Sequencer 20 System revolutionized many research areas within the first year after its introduction. Examples are the finding that the microbial diversity of deep sea samples is considerably more complex than previously thought [9], the identification of a new class of small non-coding RNAs called piRNAs [3,4], interesting insights into human evolution based on sequencing of Neanderthal genomic DNA [6], and the identification of previously undetectable somatic mutations in cancer samples. The improved combination of read length, throughput, and very high single read accuracy makes the Genome Sequencer FLX System potentially suitable for even more innovative applications and provides scientists with even higher data quality (Figure 3). In particular, long sequence reads are crucial for generating solid, high-quality data in many research areas, such as de novo sequencing, epigenomics, metagenomics, the identification of insertions and deletions in re-sequencing for medical research, the analysis of alternative splice variants, and plant sequencing.

Conclusions and Outlook

With the Genome Sequencer FLX System, 454 Life Sciences and Roche Applied Science have introduced their “second-generation Genome Sequencer System”. Thanks to its combination of read length, throughput, and very high sequence accuracy, it can be used for a broad range of applications.

This flexibility also offers researchers a versatile portfolio of applications to address such complex research fields as cancer research, inherited diseases, infectious diseases, or plant genomics.

In cancer research, for example, the Genome Sequencer FLX System can be employed as a basis for epigenomic studies, gene regulation analysis using expression profiling, chromosomal rearrangement analysis, genome wide identification of sncRNAs, or the genome wide identification of transcription factor binding sites. In addition, it can be used for the identification of alternative splice variants or alternative transcription start and termination sites, or for the re-sequencing of parts or complete cancer genomes, as well as for the sensitive identification of somatic mutations in cancer biopsy samples.

For many applications in these research fields, read lengths of >100 bases are required to uncover the full biological information encoded in DNA sequences, such as SNPs in both exonic and intronic sequences, structural variants, and the very important small deletions in the range of 3–500 bases. Shorter reads often give an incomplete picture of the real situation, which can lead to misleading and expensive downstream analysis.

Currently, the Genome Sequencer FLX System represents the most enabling technology in the life science market. Using it, many new and unexpected breakthrough results can be gained in a short period of time.

References

1. Berka J et al. (2006) Biochemica 4:7–10

2. Ng P et al. (2006) Nucleic Acids Res 34:e84

3. Girard A et al. (2006) Nature 442:199–202

4. Lau NC et al. (2006) Science 313:363–367

5. Ruby JG et al. (2006) Cell 127:1193–1207

6. Green RE et al. (2006) Nature 444:330–336

7. Poinar HN et al. (2006) Science 311:392–394

8. Turnbaugh PJ et al. (2006) Nature 444:1027–1031

9. Sogin ML et al. (2006) Proc Natl Acad Sci 103:12115–12120

 

This article was originally published in Biochemica 2/2007, pages 4-6. ©Springer Medizin Verlag 2007
