Calculate RNAseq coverage of a transcript
1
2
Entering edit mode
7.0 years ago

Hi Biostars community,

I have bam files with RNA-seq results from ~1000 of single cell samples. For every transcript in each sample, I would like to calculate what percentage of spliced transcript's length is covered by aligned reads. Besides, It would be great to have this value for coding part of transcripts only, or, ideally, for every exon.

What would be the most straightforward way to do this?

Thanks a lot!

RNA-Seq transcriptome • 3.6k views
ADD COMMENT
2
Entering edit mode

To rephrase: per sample, you want to count how many reads overlap with an exon? In that case: featureCounts. Calculating a percentage afterward shouldn't be too hard, e.g. using R/Python/....

ADD REPLY
2
Entering edit mode

Agree with @WouterDeCoster, you can try to use featureCounts. You need to get an annototation file (GTF file), in featureCounts use isGTFAnnotationFile = TRUE (to set up your GTF file), GTF.featureType = "exon" (for read summarization), GTF.attrType = "exon_id" (to group features).

Otherwise, to quantify abundances of transcripts from RNA-Seq data, Kallisto could be an other way to do the trick (check this here https://pachterlab.github.io/kallisto/manual).

ADD REPLY
1
Entering edit mode
7.0 years ago

Thanks to WouterDeCoster and Bastien Herve for their answers!

I have found out that bedtools does exactly what I need:

bedtools coverage -hist -abam "your bam file" -b "your bed file" > "file with results"

ADD COMMENT

Login before adding your answer.

Traffic: 2010 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6