read coverage and transcripts
7.2 years ago
qudrat ▴ 100

Can somebody suggest how to obtain the read coverage of a particular transcript in a transcript assembly done by TopHat/Cufflinks?

RNA-Seq sequencing • 2.4k views

a transcript assembly done by TopHat/Cufflinks?

Correct me if I'm wrong, but are you sure this is an assembly? TopHat is an aligner, not an assembler.


@WouterDeCoster, TopHat is an aligner and Cufflinks is used for assembly.


Right, but it doesn't modify the alignment, does it? You want to assess the coverage in the alignment. An assembly doesn't have a coverage.


Yes, I want to assess the coverage in the alignment. In my question, I just wanted to make clear which software I used.


Alright, very well. Excuse my pedantic nitpicking here, but correct terminology is quite important for quickly getting the right answers.

7.2 years ago
Renesh ★ 2.2k

Here, you need to find the number of reads that are mapped to the given transcript. For this you need the BAM file of mapped reads (coordinate-sorted and indexed, so that samtools can query a region), the transcript's chromosome, and its start and end coordinates. With this information, you can extract the reads using samtools as follows:

samtools view BAM_file Chromosome:start-end

This returns all reads that mapped to the given transcript region. To get the read count for the region, pipe the output to wc -l:

samtools view BAM_file Chromosome:start-end | wc -l
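
As a hedged sketch of the commands above: `samtools view -c` prints the count directly instead of piping to `wc -l`, and `samtools depth` reports per-base depth, which can be averaged over the region. The file name `accepted_hits.bam` (TopHat's default output name) and the region `chr1:1000-2000` are placeholders, and the helper names `count_reads` and `mean_depth` are made up for illustration.

```shell
#!/bin/sh
# Sketch only: file name, region, and function names are assumptions.

# Count alignments overlapping a region. Needs a coordinate-sorted,
# indexed BAM: run `samtools index accepted_hits.bam` once beforehand.
count_reads() {
    # $1 = BAM file, $2 = region such as chr1:1000-2000
    samtools view -c "$1" "$2"
}

# Average the depth column (3rd field) of `samtools depth` output
# read from stdin; prints NA when there is no input.
mean_depth() {
    awk '{ sum += $3; n++ } END { if (n > 0) printf "%.2f\n", sum / n; else print "NA" }'
}

# Example usage (not run here):
#   count_reads accepted_hits.bam chr1:1000-2000
#   samtools depth -a -r chr1:1000-2000 accepted_hits.bam | mean_depth
```

Note that both numbers cover the whole genomic interval, introns included, so for a spliced transcript it may be more meaningful to restrict the query to its exons.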


Renesh, what if a gene has more than one transcript and the transcripts share the same coordinates, as in the case of exon skipping or intron retention? How would one find the coverage for each transcript?

7.2 years ago
Hussain Ather ▴ 990

Alongside samtools, bedtools also offers ways of working with TopHat/Cufflinks output: http://bedtools.readthedocs.io/en/latest/index.html
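
One possible way to do this with bedtools, sketched under assumptions: convert the exon records of the Cufflinks `transcripts.gtf` to BED, then let `bedtools coverage` report read counts and the covered fraction per exon. The helper name `gtf_exons_to_bed` and the file names are hypothetical, and the awk code assumes the standard GTF layout with a `transcript_id "..."` attribute in column 9.

```shell
#!/bin/sh
# Sketch only: function name and file names are assumptions.

# Convert GTF exon records (stdin) to BED6 (stdout), named by transcript.
# GTF is 1-based inclusive; BED is 0-based half-open, hence start - 1.
# The substr offsets skip the 15 characters of `transcript_id "` and
# drop the closing quote.
gtf_exons_to_bed() {
    awk -F'\t' 'BEGIN { OFS = "\t" }
    $3 == "exon" && match($9, /transcript_id "[^"]+"/) {
        id = substr($9, RSTART + 15, RLENGTH - 16)
        print $1, $4 - 1, $5, id, 0, $7
    }'
}

# Example usage (not run here; assumes bedtools 2.24 or later, where
# -a holds the features and -b the reads):
#   gtf_exons_to_bed < transcripts.gtf > exons.bed
#   bedtools coverage -split -a exons.bed -b accepted_hits.bam > exon_coverage.txt
```

Because an exon shared between isoforms appears once per transcript in the BED file, reads in shared exons are counted for every transcript that contains them.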

7.2 years ago

In addition to the tools mentioned earlier, commonly used tools for counting reads in a genomic interval (gene, exon, ...) are featureCounts and htseq-count. I would recommend featureCounts because it is very fast and has convenient options, but htseq-count also works fine. Both are well documented.
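
A minimal sketch of how these two tools might be run per transcript rather than per gene, assuming the Cufflinks `transcripts.gtf` as annotation and TopHat's `accepted_hits.bam` as input. The function names are hypothetical. Grouping exons by `transcript_id` and allowing multi-overlap (`-O`) means a read in an exon shared by several isoforms is counted for each of them; neither tool apportions such reads between isoforms.

```shell
#!/bin/sh
# Sketch only: function names and file names are assumptions.

# featureCounts: group exons by transcript_id instead of the default gene_id.
# $1 = GTF annotation, $2 = BAM file, $3 = output table
run_featurecounts() {
    featureCounts -a "$1" -t exon -g transcript_id -O -o "$3" "$2"
}

# htseq-count equivalent; reads overlapping several transcripts are
# reported as ambiguous in its default mode.
# $1 = GTF annotation, $2 = coordinate-sorted BAM file
run_htseq_count() {
    htseq-count -f bam -r pos -t exon -i transcript_id "$2" "$1"
}

# Look up one ID in a featureCounts table. Assumes a single BAM was
# counted, so the count sits in column 7 after the six annotation columns.
# $1 = featureCounts output table, $2 = feature ID
count_for_id() {
    awk -F'\t' -v id="$2" '!/^#/ && $1 == id { print $7 }' "$1"
}

# Example usage (not run here):
#   run_featurecounts transcripts.gtf accepted_hits.bam counts.txt
#   count_for_id counts.txt CUFF.1.1
```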
