Single rare variant approaches
2
0
Entering edit mode
6.7 years ago
Tash • 0

Hi all,

I am new to this forum - forgive me if this is a very basic question.

I am currently analysing a WES dataset, and have identified 4 rare variants of interest in the same gene (a total of 8 cases share mutations in this gene in a sample of 50 cases). Two of the variants have a MAF < 1%, while the other two are around 1.2%.

I have read that it may be appropriate to perform a burden association test in PLINK /SEQ for the rarer variants in this gene to obtain a score of 'mutational load' (I guess to increase power), but I am hoping to get some general guidance in what would be the best approach to analyse single rare variants?

Thanks!

next-gen • 4.2k views
ADD COMMENT
3
Entering edit mode
6.7 years ago
NB ▴ 960

You can try Variant tools that includes SKAT as mentioned by @WouterDeCoster . These can take vcf files as input . There are tutorials on the website which can help find your way with these tests. otherwise if your comfortable to try R, have a look at SKAT and R

ADD COMMENT
1
Entering edit mode

Oh yeah, that might be easier than going in R yourself and less painful than what I described to do manually.
But I like my analysis authentic, artisanal, GMO and gluten-free ;-)

ADD REPLY
5
Entering edit mode
6.7 years ago

SKAT and SKAT-O (in R) are two commonly used tools for rare variant association testing, using also more sophisticated models than simple burden test.

ADD COMMENT
0
Entering edit mode

Appreciate your reply.

Do you know of any good example data I could download to get a feel for how to perform SKAT in R? To me it is not completely straightforward what the input file should entail (again... newbie), but it might help to see something in front of me.

ADD REPLY
4
Entering edit mode

I usually use plink to generate a .raw file:

plink --vcf yourvariants.vcf --out projectname
plink --hwe 0.001 --bfile projectname --max-maf 0.01 --recode A --out projectname

Make sure to use the right .fam file, fill in conditions (phenotype)

And then use the following Rscript, saved as e.g. SKAT.R:

Rscript SKAT.R projectname.raw

#!/usr/bin/Rscript

library('SKAT')

cat("WARNING: this script doesn't perform a sanity check on your data and therefore it's your own responsibility to provide correct input data.\n")
args = unlist(strsplit(commandArgs(trailingOnly = TRUE)," "))
basename = strsplit(basename(args), "\\.")[[1]][1]
dat <- read.table(args[1], header=TRUE, stringsAsFactors=FALSE) 
phen = ifelse(dat$PHENOTYPE == 2, 0, 1) #Convert affected = 2 to affected = 0
geno = as.matrix(dat[,-c(1:6)])
obj <- SKAT_Null_Model(phen~1, out_type="D")
skatT <- SKAT(geno, obj, method="optimal.adj")
cat(skatT$p.value)
cat("\n\n")

warnings = names(as.vector(warnings()))
wardf <- data.frame(V1=paste("warning", 1:length(warnings), ':', sep=""), V2=warnings)
output = rbind(
    data.frame(
        V1=c("individuals", "controls", "cases", paste("pval for rho=", skatT$param$rho, ":", sep=""), "test_p:"),
        V2=c(length(phen), table(phen)[[2]], table(phen)[[1]], skatT$param$p.val.each, skatT$p.value)),
    wardf)
write.table(output, paste(basename, "_SKAT_details.txt", sep=""), col.names=F, quote=F, row.names=F, sep="\t")
ADD REPLY

Login before adding your answer.

Traffic: 2604 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6