I want to plot Depth of Coverage values in R. I've extracted all the coverage values from my VCF and want to plot them. The x-axis would be depth and the y-axis would be counts. I don't have the counts; I just have a .txt file with a single column of values.
In R:
library(ggplot2)
filename <- "KMM1_raw_variants_DP_values_10102018.txt"
my_data <- read.csv(filename, sep="\t", header=FALSE)
head(my_data)
   V1
1 350
2 432
3 431
4 479
5 469
6 410
names(my_data)[1] <- c("Coverage")
  Coverage
1      350
2      432
3      431
4      479
5      469
6      410
ggplot(my_data,aes(x=Coverage, y=counts)) + geom_line()
Error in FUN(X[[i]], ...) : object 'counts' not found
How do I tell R to count how many times each coverage value occurs, so that I can see the distribution of Depth of Coverage and decide what to filter out when using GATK? I'm after something like the first graph here: http://mbontrager.org/blog/2016/08/17/Variant-Exploration
Thanks!!
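A minimal sketch of two ways to get the counts, assuming the column holds numeric depth values. The small data frame is made-up data standing in for the real file, and the binwidth is an arbitrary choice. Either let ggplot2 do the counting by mapping only x and using geom_histogram, or tabulate the values yourself with table() and plot the result as a line.

```r
library(ggplot2)

# Made-up stand-in for the DP column read from the .txt file
my_data <- data.frame(Coverage = c(350, 432, 431, 479, 469, 410, 432, 350, 350))

# Option 1: let ggplot2 bin and count the values; only x is mapped
p_hist <- ggplot(my_data, aes(x = Coverage)) +
  geom_histogram(binwidth = 10)  # binwidth is an arbitrary choice here

# Option 2: count each distinct depth explicitly, then draw a line
counts <- as.data.frame(table(Coverage = my_data$Coverage),
                        responseName = "counts",
                        stringsAsFactors = FALSE)
counts$Coverage <- as.numeric(counts$Coverage)  # table() labels come back as text

p_line <- ggplot(counts, aes(x = Coverage, y = counts)) +
  geom_line()

print(counts)
```

Printing p_hist or p_line draws the plot. With real data, a wider binwidth may be easier to read than one point per distinct depth.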
When I do that I get this error:
Error: StatBin requires a continuous x variable: the x variable is discrete. Perhaps you want stat="count"?
Are you sure that the entries of my_data$Coverage are numbers? What does str(my_data) return?
'data.frame': 725952 obs. of 1 variable:
 $ V1: num 478 569 568 620 609 545 242 240 229 346 ...
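For reference, the StatBin error is what geom_histogram reports when the column mapped to x is discrete (a factor or character vector) rather than numeric. Note that the str() output above lists the column as V1, not Coverage, so it may be worth checking the exact data frame that is passed to ggplot. The following is only a sketch with hypothetical values, showing how to check the column type and convert it before plotting.

```r
library(ggplot2)

# Hypothetical example: depth values that were read in as text
my_data <- data.frame(Coverage = c("350", "432", "431", "479"),
                      stringsAsFactors = FALSE)

is.numeric(my_data$Coverage)  # FALSE here, so geom_histogram would complain

# Convert via character so that factor columns are handled correctly too
my_data$Coverage <- as.numeric(as.character(my_data$Coverage))
stopifnot(is.numeric(my_data$Coverage), !anyNA(my_data$Coverage))

# With a numeric x, the default stat of geom_histogram works
p <- ggplot(my_data, aes(x = Coverage)) +
  geom_histogram(bins = 30)
```

If as.numeric() produces NA values, some entries in the file are not plain numbers and are worth inspecting before filtering.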