I'm trying to find intron-spanning reads (which should be the same as exon-exon spanning reads) to find 'real' coding transcripts. What software/programs can be used to achieve this?
There are many threads on this same or a similar question. Here's a good place to start, and a bit of googling will turn up several more.
@Trivas, a trick: simply paste the link to the Biostar thread you want to refer to, and Biostar expands the title automatically, e.g. How To Extract Spliced Rnaseq Reads
I want to generate counts for exon-exon spanning reads only and want to use HTSeq count for this if possible. I'm on the Galaxy website and can provide a BAM file of my sample, but as far as I can see, none of the GENCODE GTF files contains information on whether certain reads are exon-exon spanning reads. So how should I do this?
One hacky way to do it is to keep only the reads that have a skipped region, i.e. an N operation in the CIGAR string. The CIGAR is column 6 of a SAM record, so you can grep that column for N.
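In case grep feels too loose, here is a minimal Python sketch of the same idea. It assumes uncompressed SAM text as input (for example the output of `samtools view -h`) and an aligner that writes introns as N operations. The function names `is_spliced` and `filter_spliced` are made up for illustration and do not come from any library.

```python
import re

# Matches one CIGAR operation: a length followed by an operation letter.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def is_spliced(cigar):
    """Return True if the CIGAR string contains at least one N (skipped region)."""
    return any(op == "N" for _, op in CIGAR_OP.findall(cigar))


def filter_spliced(sam_lines):
    """Yield SAM header lines plus alignments whose CIGAR contains an N.

    Assumes tab-separated SAM text; the CIGAR is the 6th column.
    """
    for line in sam_lines:
        if line.startswith("@"):
            # Keep header lines so the output is still valid SAM.
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) > 5 and is_spliced(fields[5]):
            yield line
```

To use it on a stream, something like `sys.stdout.writelines(filter_spliced(sys.stdin))` in a small script fed by `samtools view -h sample.bam` should work (`sample.bam` is a placeholder name). Checking only the CIGAR column avoids false hits from an N in the read sequence or in optional tags, which a plain grep over the whole line would pick up.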
Yes I think this is the solution because spliced reads are the same as exon-exon spanning reads! I hadn't thought of that, thanks!
Are you worried about DNA contamination?
Yes, exactly.