We have Illumina SE RNA-Seq reads that we are playing around with to learn how to map transcripts to a reference genome.
Pre-processing was performed in multiple steps using tools from the BBMap suite.
Mapping to a reference genome was performed using the STAR aligner.
To figure out how good the RNA integrity of that sample is, we are trying to use tin.py from the RSeQC package. It runs OK, and the results we get for our test mapping, AFTER the BBMap pre-processing steps, are as follows:
Bam_file TIN(mean) TIN(median) TIN(stdev)
DEFAULT1_Feb2020Aligned.sortedByCoord.out.bam 66.5226846932 70.1270683349 15.6310316213
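In case it helps when comparing several samples, the summary above can be read programmatically. This is only a minimal sketch, assuming the summary is tab-separated with exactly the header shown above; the function name `read_tin_summary` and the `SUMMARY` string are made up for illustration and are not part of RSeQC.

```python
import csv
import io

# Summary text copied from the post. The tab separators are an assumption
# about how the summary file is laid out.
SUMMARY = (
    "Bam_file\tTIN(mean)\tTIN(median)\tTIN(stdev)\n"
    "DEFAULT1_Feb2020Aligned.sortedByCoord.out.bam"
    "\t66.5226846932\t70.1270683349\t15.6310316213\n"
)


def read_tin_summary(text):
    """Return one dict per BAM file with the TIN statistics as floats."""
    rows = []
    for rec in csv.DictReader(io.StringIO(text), delimiter="\t"):
        rows.append({
            "bam": rec["Bam_file"],
            "mean": float(rec["TIN(mean)"]),
            "median": float(rec["TIN(median)"]),
            "stdev": float(rec["TIN(stdev)"]),
        })
    return rows


tin_rows = read_tin_summary(SUMMARY)
for row in tin_rows:
    # Print a compact one-line report per sample.
    print(f'{row["bam"]}: median TIN {row["median"]:.1f} '
          f'(mean {row["mean"]:.1f}, sd {row["stdev"]:.1f})')
```

With more than one BAM file, the summaries could be concatenated under a single header and passed through the same function to line the samples up side by side.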
Our questions are these:
1. Should we have run tin.py on the BAM from raw reads mapped to the reference genome, BEFORE any of the pre-processing steps?
2. Or AFTER SOME of those pre-processing steps?
3. Or is it OK to run tin.py on the BAM from mapping fully pre-processed reads to the reference genome?
4. Given TIN values can be 0-100, is the result shown above kinda poor, or good enough to use for downstream steps like DGE?
Thanks, in advance, for insightful answers from forum members!
Have you considered checking out the 2016 BMC Bioinformatics paper that describes TIN and its relation to RIN? It should answer almost all your questions, if not all of them :)
Were you able to speed up tin.py? It is horrendously slow.
Have to look up my notes. How long is it taking you? Have you tried a test input file and/or contacting the author(s)?