Pharmacogenomics has focused mainly on the functional impact of a small number of variants in a single drug metabolism gene. As the field matures, it has become clear that advances in understanding the genetic basis of drug effect must look beyond the genes that we expect to be involved in a drug pathway. The coincident completion of pharmacogenetic studies that use genome-wide association (GWA) or other approaches to identify commonly altered genomic regions, and the ascendance of next-generation sequencing and analysis methods, provide a compelling framework for increasing our knowledge of rare mutations in affected cohorts. This is especially challenging in diseases such as cancer, heart failure, or mental retardation syndromes, where complex variation in copy number and sequence occurs simultaneously within a given gene. Targeting these genomic regions of interest is a technology exercise that has been developing over the past several years, and typically involves the following steps once these regions have been identified:
Downstream functional studies may further enhance our understanding of how these variants play a role in disease susceptibility, onset or severity.
As an alternative to, or in concert with, these targeted sequencing studies, light sequence coverage (4X) of the entire genome with paired-end sequencing reads can be obtained quickly and inexpensively on next-generation sequencing platforms. When coupled with careful analysis of read-pair mapping, these reads can be used to identify regions of amplification or deletion (copy number alteration, or CNA) and other structural variations (inversions, cryptic translocations) that may be vital to the understanding of pharmacogenetic traits.
Several years ago, anticipating that targeted re-sequencing would be applicable to mutation discovery in cancer samples, The Genome Institute developed high-throughput, PCR-based capabilities for this purpose. These capabilities cover the full suite of activities needed for a “pipeline,” including automated PCR primer design, primer validation, automated 384-well PCR reaction assembly, post-reaction processing, and sequencing assembly at low volume to minimize DNA usage and cost. To analyze the capillary traces resulting from these efforts, we developed an automated analysis pipeline capable of identifying high-quality single-nucleotide and indel variants, predicting the impact of the identified mutations on protein structure, and storing the resulting information in a database for downstream analyses such as correlation of mutational profiles across samples and correlation to clinical data elements. The use of this pipeline has expanded our understanding of the mutations occurring in lung adenocarcinoma, acute myeloid leukemia, and glioblastoma multiforme. Coincident with our PCR pipeline development work, we also began investigating targeted capture by solution-phase hybridization.
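The read-pair mapping analysis mentioned above for low-coverage whole-genome sequencing can be illustrated with a minimal Python sketch. It is not The Genome Institute's pipeline. It only shows the general idea that a pair's chromosome assignment, strand orientation, and mapped span hint at the kind of structural variation it may support. The `ReadPair` record, the `classify_pair` function, and the library parameters (`mean_insert`, `sd_insert`, `read_len`, `n_sd`) are all hypothetical names and values chosen for illustration.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class ReadPair:
    """Hypothetical record for one mapped paired-end read pair."""
    chrom1: str
    pos1: int      # leftmost mapped coordinate of read 1
    strand1: str   # '+' or '-'
    chrom2: str
    pos2: int      # leftmost mapped coordinate of read 2
    strand2: str


def classify_pair(pair, mean_insert=400, sd_insert=40, read_len=75, n_sd=3):
    """Label a read pair by the structural signal it may carry.

    All thresholds are illustrative assumptions, not published settings.
    """
    # Mates on different chromosomes hint at a translocation.
    if pair.chrom1 != pair.chrom2:
        return "translocation_candidate"
    # Mates on the same strand hint at an inversion breakpoint.
    if pair.strand1 == pair.strand2:
        return "inversion_candidate"
    # Order the mates by coordinate so 'left' is the upstream read.
    (left_pos, left_strand), (right_pos, _) = sorted(
        [(pair.pos1, pair.strand1), (pair.pos2, pair.strand2)]
    )
    # Upstream read on '-' means the mates point away from each other.
    if left_strand == "-":
        return "tandem_duplication_candidate"
    # Approximate outer span of the fragment on the reference.
    span = right_pos - left_pos + read_len
    if span > mean_insert + n_sd * sd_insert:
        return "deletion_candidate"    # mates map too far apart
    if span < mean_insert - n_sd * sd_insert:
        return "insertion_candidate"   # mates map too close together
    return "concordant"


def summarize(pairs, **kwargs):
    """Count how many pairs fall into each class."""
    return Counter(classify_pair(p, **kwargs) for p in pairs)
```

In practice a single discordant pair is weak evidence, particularly at 4X coverage. A real analysis would typically require clusters of discordant pairs supporting the same breakpoint and would combine them with read-depth information before calling a copy number alteration.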
Based on our early success, and taking advantage of new technology such as the capability to produce up to 55,000 oligos that could be cleaved from the synthesis substrate (Agilent) as well as the massively parallel sequencing capability of next-generation platforms, we developed a novel capture method based on our original BAC-based capture. We refer to this method as “WUCap” (Washington University Capture), and have now incorporated it into the sequencing production activities at The Genome Institute. This approach has been applied to targeted capture and sequencing in the 1000 Genomes Project (pilot 3 studies) and to the sequencing of 6,000 genes in ovarian cancer samples for The Cancer Genome Atlas (TCGA). It will soon be applied to a set of 500 genes suspected in retinitis pigmentosa, as well as to our NHGRI-sponsored medical sequencing project to investigate GWAS peaks resulting from case/control studies of metabolic syndrome in over 7,000 genomes.
As part of the PGRN, we use all of these technologies to further the network’s research. Our extensive LIMS and sample intake pipeline are set up to receive thousands of samples, manage associated clinical data, and track the progress of these projects.