%0 Journal Article %J Nat Genet %D 2020 %T Dynamic incorporation of multiple in silico functional annotations empowers rare variant association analysis of large whole-genome sequencing studies at scale. %A Li, Xihao %A Li, Zilin %A Zhou, Hufeng %A Gaynor, Sheila M %A Liu, Yaowu %A Chen, Han %A Sun, Ryan %A Dey, Rounak %A Arnett, Donna K %A Aslibekyan, Stella %A Ballantyne, Christie M %A Bielak, Lawrence F %A Blangero, John %A Boerwinkle, Eric %A Bowden, Donald W %A Broome, Jai G %A Conomos, Matthew P %A Correa, Adolfo %A Cupples, L Adrienne %A Curran, Joanne E %A Freedman, Barry I %A Guo, Xiuqing %A Hindy, George %A Irvin, Marguerite R %A Kardia, Sharon L R %A Kathiresan, Sekar %A Khan, Alyna T %A Kooperberg, Charles L %A Laurie, Cathy C %A Liu, X Shirley %A Mahaney, Michael C %A Manichaikul, Ani W %A Martin, Lisa W %A Mathias, Rasika A %A McGarvey, Stephen T %A Mitchell, Braxton D %A Montasser, May E %A Moore, Jill E %A Morrison, Alanna C %A O'Connell, Jeffrey R %A Palmer, Nicholette D %A Pampana, Akhil %A Peralta, Juan M %A Peyser, Patricia A %A Psaty, Bruce M %A Redline, Susan %A Rice, Kenneth M %A Rich, Stephen S %A Smith, Jennifer A %A Tiwari, Hemant K %A Tsai, Michael Y %A Vasan, Ramachandran S %A Wang, Fei Fei %A Weeks, Daniel E %A Weng, Zhiping %A Wilson, James G %A Yanek, Lisa R %A Neale, Benjamin M %A Sunyaev, Shamil R %A Abecasis, Gonçalo R %A Rotter, Jerome I %A Willer, Cristen J %A Peloso, Gina M %A Natarajan, Pradeep %A Lin, Xihong %K Cholesterol, LDL %K Computer Simulation %K Genetic Predisposition to Disease %K Genetic Variation %K Genome %K Genome-Wide Association Study %K Humans %K Models, Genetic %K Molecular Sequence Annotation %K Phenotype %K Whole Genome Sequencing %X

Large-scale whole-genome sequencing studies have enabled the analysis of rare variants (RVs) associated with complex phenotypes. Commonly used RV association tests have limited scope to leverage variant functions. We propose STAAR (variant-set test for association using annotation information), a scalable and powerful RV association test method that effectively incorporates both variant categories and multiple complementary annotations using a dynamic weighting scheme. For the latter, we introduce 'annotation principal components', multidimensional summaries of in silico variant annotations. STAAR accounts for population structure and relatedness and is scalable for analyzing very large cohort and biobank whole-genome sequencing studies of continuous and dichotomous traits. We applied STAAR to identify RVs associated with four lipid traits in 12,316 discovery and 17,822 replication samples from the Trans-Omics for Precision Medicine Program. We discovered and replicated new RV associations, including disruptive missense RVs of NPC1L1 and an intergenic region near APOC1P1 associated with low-density lipoprotein cholesterol.

%B Nat Genet %V 52 %P 969-983 %8 2020 09 %G eng %N 9 %1 https://www.ncbi.nlm.nih.gov/pubmed/32839606?dopt=Abstract %R 10.1038/s41588-020-0676-4

%0 Journal Article %J Am J Hum Genet %D 2019 %T ACAT: A Fast and Powerful p Value Combination Method for Rare-Variant Analysis in Sequencing Studies. %A Liu, Yaowu %A Chen, Sixing %A Li, Zilin %A Morrison, Alanna C %A Boerwinkle, Eric %A Lin, Xihong %X

Set-based analysis that jointly tests the association of variants in a group has emerged as a popular tool for analyzing rare and low-frequency variants in sequencing studies. The existing set-based tests can suffer significant power loss when only a small proportion of variants are causal, and their power can be sensitive to the number, effect sizes, and effect directions of the causal variants and to the choice of weights. Here we propose an aggregated Cauchy association test (ACAT), a general, powerful, and computationally efficient p value combination method for boosting power in sequencing studies. First, by combining variant-level p values, we use ACAT to construct a set-based test (ACAT-V) that is particularly powerful in the presence of only a small number of causal variants in a variant set. Second, by combining different variant-set-level p values, we use ACAT to construct an omnibus test (ACAT-O) that combines the strength of multiple complementary set-based tests, including the burden test, sequence kernel association test (SKAT), and ACAT-V. Through analysis of extensively simulated data and the whole-genome sequencing data from the Atherosclerosis Risk in Communities (ARIC) study, we demonstrate that ACAT-V complements the SKAT and the burden test, and that ACAT-O is substantially more robust and more powerful than the alternative tests.

%B Am J Hum Genet %V 104 %P 410-421 %8 2019 Mar 07 %G eng %N 3 %1 https://www.ncbi.nlm.nih.gov/pubmed/30849328?dopt=Abstract %R 10.1016/j.ajhg.2019.01.002
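
Note: the p value combination described in the ACAT abstract above can be illustrated with a minimal sketch. It assumes the standard Cauchy combination form, in which each p value is mapped to a Cauchy-distributed quantity, the quantities are averaged with non-negative weights, and the average is mapped back to a single p value. The function name `acat_combine`, the equal default weights, and the input checks are illustrative assumptions, not the authors' released implementation; a production version would also need a numerically stable treatment of extremely small p values.

```python
import math


def acat_combine(p_values, weights=None):
    """Combine p values with a Cauchy combination (ACAT-style) statistic.

    Hypothetical helper for illustration only; not the authors' code.
    """
    n = len(p_values)
    if n == 0:
        raise ValueError("need at least one p value")
    if weights is None:
        weights = [1.0] * n  # assumption: equal weights by default
    if len(weights) != n or any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative, one per p value")
    if any(not (0.0 < p < 1.0) for p in p_values):
        raise ValueError("p values must lie strictly between 0 and 1")

    total = sum(weights)
    if total == 0:
        raise ValueError("at least one weight must be positive")

    # Map each p value to a standard Cauchy variate and take the weighted mean.
    stat = sum(
        (w / total) * math.tan((0.5 - p) * math.pi)
        for p, w in zip(p_values, weights)
    )
    # Upper tail of the standard Cauchy distribution gives the combined p value.
    return 0.5 - math.atan(stat) / math.pi


if __name__ == "__main__":
    # One small p value among unremarkable ones still yields a small combined p.
    print(acat_combine([0.01, 0.5, 0.5]))
```

With a single input the function returns that p value unchanged, which is a quick sanity check on the transform. In this sketch the same routine could serve both uses named in the abstract: combining variant-level p values (as in ACAT-V) or combining set-level p values from several tests (as in ACAT-O).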