Hi,
I have a list of CNVs called using XHMM from exomes. I would like to know the best way to visualize these CNVs. I hope to get a reply soon.
Hi,
I have a list of CNVs called using XHMM from exomes. I would like to know the best way to visualize these CNVs. I hope to get a reply soon.
Hi @nkausthu,
I've not used XHMM myself, but if you can load the data into R, you can plot the data using karyoploteR.
I've created a small simulated results using the .xcnv as decribed in the XHMM tutorial using the createRandomRegions
function from regioneR with the following code:
library(karyoploteR)
set.seed(1231)
#Simulate output
xhmm.out <- data.frame()
for(nsample in 1:10) {
dd <- toDataframe(createRandomRegions(nregions=10, length.mean = 20e6, length.sd=5e6, mask=NA, non.overlapping = TRUE))
intervals <- paste0(dd[,1], ":", dd[,2], "-", dd[,3])
xhmm.out <- rbind(xhmm.out, data.frame(SAMPLE=paste0("Sample", nsample),
CNV=rep(c("DEL", "DUP"), 5),
INTERVAL=intervals, stringsAsFactors=FALSE))
}
That creates a data.frame with the columns we need to plot
> head(xhmm.out)
SAMPLE CNV INTERVAL
1 Sample1 DEL chr18:29077044-48674484
2 Sample1 DUP chr2:168227823-191704249
3 Sample1 DEL chr5:163255214-174457397
4 Sample1 DUP chr20:14003146-35810966
5 Sample1 DEL chr12:29226733-59642338
6 Sample1 DUP chr1:160156036-178393920
And we can start plotting. In this case we'll plot all chromosomes in a single line and represent each sample independently with gains in red and losses in green.
samples <- as.character(unique(xhmm.out$SAMPLE))
num.samples <- length(samples)
kp <- plotKaryotype(plot.type = 4, ideogram.plotter = NULL, labels.plotter = NULL)
kpAddChromosomeNames(kp, srt=45)
kpAddCytobandsAsLine(kp)
for(nsample in seq_len(num.samples)) {
s <- samples[nsample]
#Extract and prepare the data of the current sample
sample.regions <- xhmm.out[xhmm.out$SAMPLE==s,]
regs <- data.frame(do.call(rbind, strsplit(x = sample.regions$INTERVAL, split = c(":|-"))), stringsAsFactors=FALSE)
regs[,2] <- as.numeric(regs[,2])
regs[,3] <- as.numeric(regs[,3])
regs <- toGRanges(regs)
#And plot
#Define the vertical space for the sample "track"
r0 <- (nsample-1)/num.samples
r1 <- (nsample)/num.samples - 0.05 #this 0.05 is a small margin between samples
#Add the labels, etc...
kpAddLabels(kp, r0=r0, r1=r1, labels = s)
kpAbline(kp, h=0.5, r0=r0, r1=r1, col="#888888")
#And plot the regions DUP and DEL
kpSegments(kp, data=regs[sample.regions$CNV=="DUP"], y0 = 0.5, y1=0.5, r0=r0, r1=r1, col="red", lwd=5)
kpSegments(kp, data=regs[sample.regions$CNV=="DEL"], y0 = 0.5, y1=0.5, r0=r0, r1=r1, col="green", lwd=5)
}
And you'll get something like this
You can change the chromosome arrangement and the representation of the DUPs and DELs using any of the available plotting functions you can fin in the karyoploteR tutorial and examples page.
The XHMM manual has a section titled "Visualizing XHMM results with included R scripts". Also, if you output the results to vcf, you can explore them with IGV.
Thank you so much for the reply. I had tried the XHMM tutorial but an error as follows. ./pseq --group refseq --locdb locdb --group refseq --file /mnt/exome/Softwares/HG19/nexterarapidcapture_exome_target_region.bed --out annotated_targets.refseq
pseq error : command refseq not recognised
Also I tried to load VCF file generated by XHMM in IGV but again I get an error.
Error loading DATA.vcf: The provided VCF file is malformed at approximately line number 34: Symbolic alleles not allowed as reference allele: , for input source: DATA.vcf
Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Hi,
This plot looks like I will be able to get an overview about the deletions and duplication called in each of the sample. Actually I would like to get a plot as below for a specific duplication/deletion. I think it is based on the z-score given by xhmm.
Hi nkausthu,
I cannot see the image attached. Can you fix the link?
Please see this link https://ibb.co/dzFKVH
For an image like this you can also use karyopoteR, but use the
kpPoints
function instead ofkpSegments
. You'll need to figure out the values they are plotting (the Z-score it seems) in your output files, load it into R and plot them. You can take a look at the example on how to plot SNP-array data for inspiration on a similar plot. If you want a plot of a smaller region, not the whole genome in a single plot, simply give the region as thezoom
parameter toplotKaryotype
. You can find more info in the tutorial on how to zoom in a small region of the genome.Thank you so much.. I will explore this and get back to you