%0 Journal Article
%J Science
%D 2020
%T Transcriptomic signatures across human tissues identify functional rare genetic variation.
%A Ferraro, Nicole M
%A Strober, Benjamin J
%A Einson, Jonah
%A Abell, Nathan S
%A Aguet, Francois
%A Barbeira, Alvaro N
%A Brandt, Margot
%A Bucan, Maja
%A Castel, Stephane E
%A Davis, Joe R
%A Greenwald, Emily
%A Hess, Gaelen T
%A Hilliard, Austin T
%A Kember, Rachel L
%A Kotis, Bence
%A Park, YoSon
%A Peloso, Gina
%A Ramdas, Shweta
%A Scott, Alexandra J
%A Smail, Craig
%A Tsang, Emily K
%A Zekavat, Seyedeh M
%A Ziosi, Marcello
%A Ardlie, Kristin G
%A Assimes, Themistocles L
%A Bassik, Michael C
%A Brown, Christopher D
%A Correa, Adolfo
%A Hall, Ira
%A Im, Hae Kyung
%A Li, Xin
%A Natarajan, Pradeep
%A Lappalainen, Tuuli
%A Mohammadi, Pejman
%A Montgomery, Stephen B
%A Battle, Alexis
%K Genetic Variation
%K Genome, Human
%K Humans
%K Multifactorial Inheritance
%K Organ Specificity
%K Transcriptome
%X Rare genetic variants are abundant across the human genome, and identifying their function and phenotypic impact is a major challenge. Measuring aberrant gene expression has aided in identifying functional, large-effect rare variants (RVs). Here, we expanded detection of genetically driven transcriptome abnormalities by analyzing gene expression, allele-specific expression, and alternative splicing from multitissue RNA-sequencing data, and demonstrate that each signal informs unique classes of RVs. We developed Watershed, a probabilistic model that integrates multiple genomic and transcriptomic signals to predict variant function, validated these predictions in additional cohorts and through experimental assays, and used them to assess RVs in the UK Biobank, the Million Veterans Program, and the Jackson Heart Study. Our results link thousands of RVs to diverse molecular effects and provide evidence to associate RVs affecting the transcriptome with human traits.
%B Science
%V 369
%8 2020 09 11
%G eng
%N 6509
%1 https://www.ncbi.nlm.nih.gov/pubmed/32913073?dopt=Abstract
%R 10.1126/science.aaz5900

%0 Journal Article
%J Nat Med
%D 2019
%T Identification of rare-disease genes using blood transcriptome sequencing and large control cohorts.
%A Frésard, Laure
%A Smail, Craig
%A Ferraro, Nicole M
%A Teran, Nicole A
%A Li, Xin
%A Smith, Kevin S
%A Bonner, Devon
%A Kernohan, Kristin D
%A Marwaha, Shruti
%A Zappala, Zachary
%A Balliu, Brunilda
%A Davis, Joe R
%A Liu, Boxiang
%A Prybol, Cameron J
%A Kohler, Jennefer N
%A Zastrow, Diane B
%A Reuter, Chloe M
%A Fisk, Dianna G
%A Grove, Megan E
%A Davidson, Jean M
%A Hartley, Taila
%A Joshi, Ruchi
%A Strober, Benjamin J
%A Utiramerur, Sowmithri
%A Lind, Lars
%A Ingelsson, Erik
%A Battle, Alexis
%A Bejerano, Gill
%A Bernstein, Jonathan A
%A Ashley, Euan A
%A Boycott, Kym M
%A Merker, Jason D
%A Wheeler, Matthew T
%A Montgomery, Stephen B
%K Acid Ceramidase
%K Case-Control Studies
%K Child
%K Child, Preschool
%K Cohort Studies
%K Female
%K Genetic Variation
%K Humans
%K Male
%K Models, Genetic
%K Mutation
%K Oxidoreductases Acting on CH-CH Group Donors
%K Potassium Channels
%K Rare Diseases
%K RNA
%K RNA Splicing
%K Sequence Analysis, RNA
%K Whole Exome Sequencing
%X It is estimated that 350 million individuals worldwide suffer from rare diseases, which are predominantly caused by mutation in a single gene. The current molecular diagnostic rate is estimated at 50%, with whole-exome sequencing (WES) among the most successful approaches. For patients in whom WES is uninformative, RNA sequencing (RNA-seq) has shown diagnostic utility in specific tissues and diseases. This includes muscle biopsies from patients with undiagnosed rare muscle disorders, and cultured fibroblasts from patients with mitochondrial disorders. However, for many individuals, biopsies are not performed for clinical care, and tissues are difficult to access. We sought to assess the utility of RNA-seq from blood as a diagnostic tool for rare diseases of different pathophysiologies. We generated whole-blood RNA-seq from 94 individuals with undiagnosed rare diseases spanning 16 diverse disease categories. We developed a robust approach to compare data from these individuals with large sets of RNA-seq data for controls (n = 1,594 unrelated controls and n = 49 family members) and demonstrated the impacts of expression, splicing, gene and variant filtering strategies on disease gene identification. Across our cohort, we observed that RNA-seq yields a 7.5% diagnostic rate, and an additional 16.7% with improved candidate gene resolution.
%B Nat Med
%V 25
%P 911-919
%8 2019 06
%G eng
%N 6
%1 https://www.ncbi.nlm.nih.gov/pubmed/31160820?dopt=Abstract
%R 10.1038/s41591-019-0457-8

%0 Journal Article
%J Nature
%D 2017
%T The impact of rare variation on gene expression across tissues.
%A Li, Xin
%A Kim, Yungil
%A Tsang, Emily K
%A Davis, Joe R
%A Damani, Farhan N
%A Chiang, Colby
%A Hess, Gaelen T
%A Zappala, Zachary
%A Strober, Benjamin J
%A Scott, Alexandra J
%A Li, Amy
%A Ganna, Andrea
%A Bassik, Michael C
%A Merker, Jason D
%A Hall, Ira M
%A Battle, Alexis
%A Montgomery, Stephen B
%K Bayes Theorem
%K Female
%K Gene Expression Profiling
%K Genetic Variation
%K Genome, Human
%K Genomics
%K Genotype
%K Humans
%K Male
%K Models, Genetic
%K Organ Specificity
%K Sequence Analysis, RNA
%X Rare genetic variants are abundant in humans and are expected to contribute to individual disease risk. While genetic association studies have successfully identified common genetic variants associated with susceptibility, these studies are not practical for identifying rare variants. Efforts to distinguish pathogenic variants from benign rare variants have leveraged the genetic code to identify deleterious protein-coding alleles, but no analogous code exists for non-coding variants. Therefore, ascertaining which rare variants have phenotypic effects remains a major challenge. Rare non-coding variants have been associated with extreme gene expression in studies using single tissues, but their effects across tissues are unknown. Here we identify gene expression outliers, or individuals showing extreme expression levels for a particular gene, across 44 human tissues by using combined analyses of whole genomes and multi-tissue RNA-sequencing data from the Genotype-Tissue Expression (GTEx) project v6p release. We find that 58% of underexpression and 28% of overexpression outliers have nearby conserved rare variants compared to 8% of non-outliers. Additionally, we developed RIVER (RNA-informed variant effect on regulation), a Bayesian statistical model that incorporates expression data to predict a regulatory effect for rare variants with higher accuracy than models using genomic annotations alone. Overall, we demonstrate that rare variants contribute to large gene expression changes across tissues and provide an integrative method for interpretation of rare variants in individual genomes.
%B Nature
%V 550
%P 239-243
%8 2017 10 11
%G eng
%N 7675
%1 https://www.ncbi.nlm.nih.gov/pubmed/29022581?dopt=Abstract
%R 10.1038/nature24267

%0 Journal Article
%J Am J Epidemiol
%D 2017
%T Incorporation of Biological Knowledge Into the Study of Gene-Environment Interactions.
%A Ritchie, Marylyn D
%A Davis, Joe R
%A Aschard, Hugues
%A Battle, Alexis
%A Conti, David
%A Du, Mengmeng
%A Eskin, Eleazar
%A Fallin, M Daniele
%A Hsu, Li
%A Kraft, Peter
%A Moore, Jason H
%A Pierce, Brandon L
%A Bien, Stephanie A
%A Thomas, Duncan C
%A Wei, Peng
%A Montgomery, Stephen B
%K Animals
%K Disease
%K Gene-Environment Interaction
%K Genome-Wide Association Study
%K Genomics
%K Humans
%K Models, Animal
%K Sequence Analysis, RNA
%X A growing knowledge base of genetic and environmental information has greatly enabled the study of disease risk factors. However, the computational complexity and statistical burden of testing all variants by all environments has required novel study designs and hypothesis-driven approaches. We discuss how incorporating biological knowledge from model organisms, functional genomics, and integrative approaches can empower the discovery of novel gene-environment interactions and discuss specific methodological considerations with each approach. We consider specific examples where the application of these approaches has uncovered effects of gene-environment interactions relevant to drug response and immunity, and we highlight how such improvements enable a greater understanding of the pathogenesis of disease and the realization of precision medicine.
%B Am J Epidemiol
%V 186
%P 771-777
%8 2017 Oct 01
%G eng
%N 7
%1 https://www.ncbi.nlm.nih.gov/pubmed/28978191?dopt=Abstract
%R 10.1093/aje/kwx229

%0 Journal Article
%J Nat Genet
%D 2017
%T Population- and individual-specific regulatory variation in Sardinia.
%A Pala, Mauro
%A Zappala, Zachary
%A Marongiu, Mara
%A Li, Xin
%A Davis, Joe R
%A Cusano, Roberto
%A Crobu, Francesca
%A Kukurba, Kimberly R
%A Gloudemans, Michael J
%A Reinier, Frederic
%A Berutti, Riccardo
%A Piras, Maria G
%A Mulas, Antonella
%A Zoledziewska, Magdalena
%A Marongiu, Michele
%A Sorokin, Elena P
%A Hess, Gaelen T
%A Smith, Kevin S
%A Busonero, Fabio
%A Maschio, Andrea
%A Steri, Maristella
%A Sidore, Carlo
%A Sanna, Serena
%A Fiorillo, Edoardo
%A Bassik, Michael C
%A Sawcer, Stephen J
%A Battle, Alexis
%A Novembre, John
%A Jones, Chris
%A Angius, Andrea
%A Abecasis, Gonçalo R
%A Schlessinger, David
%A Cucca, Francesco
%A Montgomery, Stephen B
%K Alternative Splicing
%K Chromosome Mapping
%K Family Health
%K Female
%K Gene Expression Profiling
%K Genetic Predisposition to Disease
%K Genetic Variation
%K Genetics, Population
%K Genome-Wide Association Study
%K Genotype
%K Humans
%K Italy
%K Male
%K Polymorphism, Single Nucleotide
%K Quantitative Trait Loci
%K Transcription Initiation Site
%X Genetic studies of complex traits have mainly identified associations with noncoding variants. To further determine the contribution of regulatory variation, we combined whole-genome and transcriptome data for 624 individuals from Sardinia to identify common and rare variants that influence gene expression and splicing. We identified 21,183 expression quantitative trait loci (eQTLs) and 6,768 splicing quantitative trait loci (sQTLs), including 619 new QTLs. We identified high-frequency QTLs and found evidence of selection near genes involved in malarial resistance and increased multiple sclerosis risk, reflecting the epidemiological history of Sardinia. Using family relationships, we identified 809 segregating expression outliers (median z score of 2.97), averaging 13.3 genes per individual. Outlier genes were enriched for proximal rare variants, providing a new approach to study large-effect regulatory variants and their relevance to traits. Our results provide insight into the effects of regulatory variants and their relationship to population history and individual genetic risk.
%B Nat Genet
%V 49
%P 700-707
%8 2017 May
%G eng
%N 5
%1 https://www.ncbi.nlm.nih.gov/pubmed/28394350?dopt=Abstract
%R 10.1038/ng.3840