How to generate an UpSet plot in R to show the variants shared between cell-free DNA samples
6 months ago

Hello,

I have cell-free DNA variant calls for 5 samples in VCF format, and I want to plot the variants shared between the samples in an UpSet plot. Which parameters do I need to consider as input, and how do I convert them into binary format? I have already converted my files to TSV format.

This is the header of my VCF file:

#CHROM  POS ID  REF ALT QUAL    FILTER  INFO    FORMAT  S1_P-gDNA   S1_P-cfDNA  NORMAL   cfDNA

Could someone please help with this?

Thanks.

6 months ago
zx8754 12k

Try this example:

library(UpSetR)

# example input: 4 samples (columns), 250 variants (rows)
set.seed(1)
d <- data.frame(matrix(sample(c("0|0", "0|1", "0|1", "1|1"), 1000,
                              replace = TRUE), ncol = 4))
head(d)
#    X1  X2  X3  X4
# 1 0|0 0|0 0|1 0|1
# 2 1|1 1|1 0|1 0|1
# 3 0|1 0|1 0|1 1|1
# 4 0|0 0|0 0|0 0|1
# 5 0|1 0|0 0|0 1|1
# 6 0|0 0|1 0|1 0|1

variants <- seq(nrow(d))

# per sample, keep the variants whose genotype is "1|1" (homozygous ALT);
# change the pattern as needed, e.g. to also count "0|1" calls
dl <- lapply(d, function(i) variants[ grepl("1|1", i, fixed = TRUE) ])
# lengths(dl)
# X1 X2 X3 X4 
# 57 66 66 61 

upset(fromList(dl))
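
For a real VCF exported to TSV, the same idea can be sketched as below. This is only an illustration under stated assumptions: the toy table, the sample names S1 to S3, and the helper has_alt() are made up for the example; each sample column is assumed to start with the GT subfield (as in a standard FORMAT column), and a variant is counted as present in a sample if its genotype carries at least one ALT allele. fromList() then does the conversion to a 0/1 matrix.

```r
# Hypothetical toy table standing in for a VCF that was exported to TSV.
# In practice this would come from something like read.delim() on your file.
vcf <- data.frame(
  CHROM = c("chr1", "chr1", "chr2", "chr2"),
  POS   = c(100L, 200L, 300L, 400L),
  REF   = c("A", "G", "T", "C"),
  ALT   = c("T", "A", "C", "G"),
  S1    = c("0/1:35", "0/0:40", "1/1:22", "./.:0"),
  S2    = c("0/1:31", "0/1:18", "0/0:50", "0/0:44"),
  S3    = c("0/1:29", "0/1:27", "1/1:33", "0/0:41"),
  stringsAsFactors = FALSE
)
samples <- c("S1", "S2", "S3")  # assumed sample column names

# One key per variant, so the same variant matches across samples
variant_id <- paste(vcf$CHROM, vcf$POS, vcf$REF, vcf$ALT, sep = "_")

# TRUE if the genotype carries at least one ALT allele.
# Assumes GT is the first ":"-separated subfield; "./." counts as absent.
has_alt <- function(x) {
  gt <- sub(":.*$", "", x)
  grepl("[1-9]", gt)
}

# Named list: for each sample, the variants that are present in it
calls <- lapply(vcf[samples], function(x) variant_id[has_alt(x)])

# Variants shared by all samples
shared_all <- Reduce(intersect, calls)

# Plot only if UpSetR is installed; fromList() builds the binary matrix
if (requireNamespace("UpSetR", quietly = TRUE)) {
  print(UpSetR::upset(UpSetR::fromList(calls)))
}
```

Using CHROM, POS, REF and ALT together as the key keeps different alleles at the same position apart. If only high-confidence calls should count, filter on FILTER or QUAL before building the list.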


It worked, thank you very much!
