RAFFI: Accurate and fast familial relationship inference in large scale biobank studies using RaPID.

Title: RAFFI: Accurate and fast familial relationship inference in large scale biobank studies using RaPID.
Publication Type: Journal Article
Year of Publication: 2021
Authors: Naseri A, Shi J, Lin X, Zhang S, Zhi D
Journal: PLoS Genet
Volume: 17
Issue: 1
Pagination: e1009315
Date Published: 2021 01
ISSN: 1553-7404
Keywords: Biological Specimen Banks; Genome, Human; Genome-Wide Association Study; Genotyping Techniques; Haplotypes; Humans; Models, Genetic; Pedigree; Polymorphism, Single Nucleotide
Abstract

Inference of relationships from whole-genome genetic data of a cohort is a crucial prerequisite for genome-wide association studies. Typically, relationships are inferred by computing the kinship coefficient (ϕ) and the genome-wide probability of zero identity-by-descent (IBD) sharing (π0) for every pair of individuals. Current leading methods are based on pairwise comparisons, which may not scale to very large cohorts (e.g., sample size >1 million). Here, we propose an efficient relationship inference method, RAFFI. RAFFI first uses the efficient RaPID method to call IBD segments, then estimates ϕ and π0 from the detected segments. This inference is achieved by a data-driven approach that adjusts the estimates based on phasing quality and genotyping quality. Using simulations, we showed that RAFFI is robust against phasing/genotyping errors, admixture events, and varying marker densities, and achieves higher accuracy than KING, the current leading method, especially for more distant relatives. When applied to the phased UK Biobank data with ~500K individuals, RAFFI is approximately 18 times faster than KING. We expect RAFFI will offer fast and accurate relatedness inference for even larger cohorts.
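
As an illustration of the two quantities named in the abstract, the sketch below derives ϕ and π0 from the total genetic length a pair shares IBD. It relies on the standard identities π0 = 1 − π1 − π2 and ϕ = π1/4 + π2/2, where π1 and π2 are the genome fractions shared on one and on both haplotypes. This is not RAFFI's implementation: the function names, the inputs, and the 3500 cM genome length are hypothetical. The degree cutoffs are the powers-of-two kinship thresholds conventionally used with KING, and RAFFI's data-driven adjustment for phasing and genotyping errors is not reproduced.

```python
# Illustrative sketch only; names and inputs are assumptions, not RAFFI's API.

def kinship_from_ibd(ibd1_cm, ibd2_cm, genome_cm):
    """Return (phi, pi0) from the genetic length shared IBD1 and IBD2."""
    pi1 = ibd1_cm / genome_cm      # fraction of genome sharing one haplotype IBD
    pi2 = ibd2_cm / genome_cm      # fraction of genome sharing both haplotypes IBD
    pi0 = 1.0 - pi1 - pi2          # fraction of genome with no IBD sharing
    phi = pi1 / 4.0 + pi2 / 2.0    # kinship coefficient
    return phi, pi0


def classify(phi, pi0):
    """Map (phi, pi0) to a relationship class using KING-style cutoffs (assumed)."""
    if phi > 2 ** -1.5:            # about 0.354
        return "duplicate/MZ twin"
    if phi > 2 ** -2.5:            # about 0.177
        # Parent-offspring pairs share one haplotype everywhere, so pi0 is near 0.
        return "parent-offspring" if pi0 < 0.1 else "full sibling"
    if phi > 2 ** -3.5:            # about 0.0884
        return "2nd degree"
    if phi > 2 ** -4.5:            # about 0.0442
        return "3rd degree"
    return "unrelated or more distant"


GENOME_CM = 3500.0  # hypothetical autosomal genome length in centimorgans

if __name__ == "__main__":
    # Expected sharing for full siblings: pi1 = 0.5, pi2 = 0.25.
    phi, pi0 = kinship_from_ibd(1750.0, 875.0, GENOME_CM)
    print(phi, pi0, classify(phi, pi0))
```

Both parent-offspring and full-sibling pairs have an expected ϕ of 0.25, which is why π0 is needed alongside ϕ to tell first-degree relationships apart.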

DOI: 10.1371/journal.pgen.1009315
Alternate Journal: PLoS Genet
PubMed ID: 33476339
PubMed Central ID: PMC7853505
Grant List:
R01 HG010086 / HG / NHGRI NIH HHS / United States
R35 CA197449 / CA / NCI NIH HHS / United States
U19 CA203654 / CA / NCI NIH HHS / United States
MC_QA137853 / MR / Medical Research Council / United Kingdom
MC_PC_17228 / MR / Medical Research Council / United Kingdom
U01 HG009088 / HG / NHGRI NIH HHS / United States