How to plot copy number variation profile in R?
1
1
Entering edit mode
5.0 years ago
reui879 ▴ 10

I'm trying to plot a copy number variation profile plot in R. This is what I'm looking for but with all the cells in my data.

https://i.stack.imgur.com/hvboX.png

Ploidy is on the Y axis and chromosome number is on the X axis

This is my data and this is what I've tried so far but it's not giving me what I'm looking for

input <- data.frame(chrom = sample("chr1"),start = sample(c(780000, 2920000, 4920000)), stop=sample(c(2920000, 4920000, 692000)), cell0=sample(1), cell1=sample(1,3,1),cell2=sample(2,1,2)


ggplot(input, aes(x=chrom, y=cell_0, group=1)) +
  geom_point() +
  geom_line(color = "#00AFBB", size = 1)
R CNV ggplot • 11k views
ADD COMMENT
0
Entering edit mode

Thank you! When I plot

plotLRR(kp, tiles, ymin=0, ymax=5, add.axis = FALSE, labels=NA)

with my own data I get this error

Error: subscript contains invalid names

ADD REPLY
9
Entering edit mode
5.0 years ago
bernatgel ★ 3.4k

Hi @reui879

You can achieve something similar to the image you linked with the Bioconductor packages karyoploteR and CopyNumberPlots.

The data you provide is too minimal to create a meaningful plot, so I'll start by creating a more realistic looking sample data. To do that, we'll take advantage of the createRandomRegions function from the package regioneR. We'll first create the segmentation of the genome and assign random copy numbers to it and then create the "data points" based on these copy numbers.

library(regioneR)
library(karyoploteR)
library(CopyNumberPlots)
#Create example data
#Randomly partition the genome
hg19 <- filterChromosomes(getGenome("hg19"), organism = "hg", chr.type = "canonical")
rr <- regioneR::createRandomRegions(nregions = 15, length.mean = 40e6, length.sd = 30e6, genome=hg19, mask=NA, non.overlapping = TRUE)                    
rr <- sort(c(rr, subtractRegions(hg19, rr)))

#Assign copy number states
rr$cn <- sample(size = length(rr), x = c(0,rep(1,2), rep(2,10), rep(3,3), rep(4,2)), replace = TRUE)

#Create the points (not very efficient nor elegant, but works)
tiles <- unlist(tileGenome(setNames(end(hg19), seqnames(hg19)), tilewidth = 2e6))
tiles$lrr <- rnorm(length(tiles), mean = 2, sd = 0.3)
for(i in 1:length(tiles)) {
  tiles$lrr[i] <- tiles$lrr[i] + subsetByOverlaps(rr, tiles[i])$cn - 2
}

And now we can start plotting. We'll create first a karyoplot with the function plotKaryotype (setting the plot.type to 4 to get all chromosomes in a single line). Since the data points are equivalent to the LRR signal in a SNP-array, we can use the plotLRR function from CopyNumberPlots (or we could directly use kpPoints and customize it). Finally, to plot the copy number lines we'll use the function PlotCopyNumberCallsAsLines from the CopyNumberPlots package.

kp <- plotKaryotype("hg19", plot.type = 4, main="Random Data, 2Mb resolution", cex=2)
plotLRR(kp, tiles, ymin=0, ymax=5, add.axis = FALSE, labels=NA)
plotCopyNumberCallsAsLines(kp, cn.calls = rr, ymin=0, ymax=5, labels="ploidy")

With these three lines of code we'd get an image like this

enter image description here

With this as a starting point, we can adjust a few parameters, add chromosome separators, an improved axis and labels and the image will end up looking much more like the one you linked.

kp <- plotKaryotype("hg19", plot.type = 4, labels.plotter = NULL, main="Random Data, 2Mb resolution", cex=3.2)
kpAddChromosomeNames(kp, srt=45, cex=2.2)
plotLRR(kp, tiles, ymin=-1, ymax=5, labels = NA,  points.col = "#AAAAAAAA", line.at.0 = FALSE, points.cex = 3, add.axis = FALSE)
plotCopyNumberCallsAsLines(kp, cn.calls = rr, ymin=-1, ymax=5, lwd=3, add.axis=FALSE, labels = NA)
kpAddChromosomeSeparators(kp, lwd=2, col = "#666666")
kpAxis(kp, ymin = -1, ymax=5, tick.pos = 0:4, cex=2.2)
kpAddLabels(kp, labels = "ploidy", cex=3, srt=90, pos=3, label.margin = 0.025)

enter image description here

There's more information on how to use the karyoploteR in the tutorial page and you can find more information on CopyNumberPlotsin the package vignette.

Hope this helps

ADD COMMENT
1
Entering edit mode

What I could never understand on Biostars is how to vote for the answers using mobile. I like both packages you have listed, but can not just upvote the answer

ADD REPLY
1
Entering edit mode

No idea... I always use it from a PC. Maybe any of the biostars gurus? In any case, if you have any problem using these packages (CopyNumberPlots might still have some rough edges and karyoploteR is still missing docs in some of the newest functionality), feel free to ask :)

ADD REPLY
0
Entering edit mode

Thank you! When I plot

plotLRR(kp, tiles, ymin=0, ymax=5, add.axis = FALSE, labels=NA)

with my own data I get this error

Error: subscript contains invalid names

I have the link to my whole file here https://pastebin.com/440AX3Dr

ADD REPLY
0
Entering edit mode

Ops! That was a bug in the package. It's been fixed and should be available in a couple of days via BiocManager. In the meantime, if you change your column name from "cell_0" to "lrr" it should work

Sorry about that! :)

ADD REPLY

Login before adding your answer.

Traffic: 1624 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6