R&Bioconductor: Extract coverage from bamfile given GRanges object?
1
Given a .bamfile and a GRanges object with locations of exons in a specific transcript, how can I extract coverage for only the regions in the GRanges object?
My end goal is to plot coverage for exons in a specific transcript.
R
bioconductor
• 2.9k views
I think I figured it out:
library(Rsamtools)
library(RNAseqData.HNRNPC.bam.chr14)
library(TxDb.Hsapiens.UCSC.hg19.knownGene)
# Get reference to example bamfile
bamfile <- RNAseqData.HNRNPC.bam.chr14_BAMFILES[1]
# Exmple transcript name
transcript <- "uc001yhh.1"
# Get exons for transcript
allExons <- GenomicFeatures::exons(
TxDb.Hsapiens.UCSC.hg19.knownGene,
vals = list(tx_name = transcript),
columns = list("EXONNAME", "TXNAME", "GENEID"))
# Expand GRanges
allExons <- expand(allExons)
# Remove unwanted rows
allExons <- allExons[mcols(allExons)$TXNAME == transcript]
# Get coverage
cov <- coverage(bamfile, param = Rsamtools::ScanBamParam(which = allExons))
# Coverage only for my transcript
cov <- cov[allExons]
# Unlist
cov <- unlist(cov)
# Turn into numeric vector
cov <- as.numeric(cov)
# Plot
plot(cov)
Login before adding your answer.
Traffic: 2431 users visited in the last hour
Careful about what 'coverage' means when exons overlap; maybe you want to ScanBamParam(which=reduce(allExons)) to avoid double counting.
For my real script I do that in an earlier step, but thanks!