I am analyzing a large case/control cohort (>500K samples) and am attempting to perform variant-set associations ("burden tests"). My first attempt was with rvtests, and while it works for smaller cohorts, for the full cohort it fails with std::bad_alloc() even on a machine with >1 TB of memory. Before I start testing alternatives: has anyone on this forum used, or know of a use of, burden-testing software besides REGENIE that can be applied to cohorts of this size?
This is targeted sequencing, so REGENIE cannot be applied.
Thanks
People working on the UK Biobank are using rvtests: https://www.medrxiv.org/content/10.1101/2021.11.04.21265866v1.full.pdf
I'm not sure that's 100% accurate, and in any case it's not likely to be helpful for me, for the following reasons:
#1: rvtests explicitly does not work for my use case.
#2: There is an open issue in rvtests demonstrating that it does not work for UKBB exomes: https://github.com/zhanxw/rvtests/issues/145
#3: The actual "N" used in your linked paper is in Table 2 on p. 46, and it is <200,000.
#4: rvtests wasn't even used for the full association testing in your linked paper; it was only used to generate per-subcohort scores, and rareMETALS was used for the association: "Score and covariance files used as input for gene-based meta-analyses in rareMETALS were generated using Rvtests as described above" (p. 24), indicating that when rvtests was run, the "N" was even lower than the values listed on p. 46. (That per-subcohort workflow is sketched below.)
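For context, here is a minimal Python sketch of that per-subcohort workflow, assuming the options documented in the rvtests README (--inVcf, --pheno, --peopleIncludeFile, --meta score,cov, --out); the file names and chunk size are placeholders, and the downstream rareMETALS meta-analysis in R is not shown:

    # Hedged sketch: split the cohort into sub-cohorts and generate
    # per-subcohort score/covariance files with rvtests, as the linked
    # paper describes; rareMETALS then meta-analyzes those files.
    # Paths and the chunk size are placeholders, not values from the paper.
    import subprocess
    from pathlib import Path

    VCF = "cohort.vcf.gz"       # placeholder: full-cohort VCF
    PHENO = "phenotype.ped"     # placeholder: phenotype/covariate file
    SAMPLES = Path("all_samples.txt").read_text().split()
    CHUNK = 50_000              # sub-cohort size; tune to available memory

    for i in range(0, len(SAMPLES), CHUNK):
        idx = i // CHUNK
        keep = Path(f"subcohort_{idx}.keep")
        keep.write_text("\n".join(SAMPLES[i:i + CHUNK]) + "\n")
        # Generate MetaScore/MetaCov files for this sub-cohort only;
        # these are the inputs used for gene-based meta-analysis.
        subprocess.run(
            [
                "rvtest",
                "--inVcf", VCF,
                "--pheno", PHENO,
                "--peopleIncludeFile", str(keep),
                "--meta", "score,cov",
                "--out", f"subcohort_{idx}",
            ],
            check=True,
        )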
tl;dr I don't think rvtests scales properly, and it has not been directly applied (i.e., in a one-shot fashion) to UKBB or UKBB-size cohorts. Please let me know if you're familiar with software that is actually capable of this.