Question

How to calculate the TIN score of a GSE dataset for post-hoc quality checks on RNA-seq data

0

5.7 years ago

naseerkhan861 ▴ 10

I am working on the GSE102741 dataset, for which I have both raw gene counts and log2 RPKM values. I want to assess the quality of the dataset with PCA, following the approach in the linked paper. How can I calculate the TIN score? Are there online tools for this? Can somebody guide me on how to assess the quality of this dataset, or suggest better guidelines for quality analysis starting from raw gene counts or log2 RPKM values?

RNA-Seq Quality • 2.3k views

updated 5.7 years ago by i.sudbery 21k • written 5.7 years ago by naseerkhan861 ▴ 10

score 3 · Accepted Answer · 2019-10-28

3

5.7 years ago

i.sudbery 21k

The Materials and Methods section of the paper you link says:

The quality of the RNA-seq data was measured using the transcript integrity number (TIN) score calculated by RSeQC (version 2.6.4; tin.py) (http://rseqc.sourceforge.net/#tin-py)

I would start there.
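As a rough sketch of what that involves: tin.py works on aligned reads, so it needs a coordinate-sorted, indexed BAM file per sample plus a gene model in BED12 format. A count or RPKM table alone is not enough. The helper below only builds (and optionally runs) the command. The file names are placeholders, and `-i`, `-r` and `-c` are the input, reference gene model and minimum-coverage options as described in the RSeQC documentation, so check `tin.py --help` against your installed version.

```python
import shutil
import subprocess


def build_tin_command(bam_path, bed12_path, min_coverage=10):
    """Return the argument list for one tin.py run (nothing is executed here)."""
    return [
        "tin.py",
        "-i", bam_path,           # sorted and indexed BAM file (placeholder path)
        "-r", bed12_path,         # gene model in BED12 format (placeholder path)
        "-c", str(min_coverage),  # minimum read count for a transcript to be scored
    ]


def run_tin(bam_path, bed12_path, min_coverage=10):
    """Run tin.py; it writes its result tables into the current directory."""
    if shutil.which("tin.py") is None:
        raise RuntimeError("tin.py not found on PATH; install RSeQC first")
    subprocess.run(build_tin_command(bam_path, bed12_path, min_coverage), check=True)


# Hypothetical file names, for illustration only.
cmd = build_tin_command("sample1.bam", "hg19_RefSeq.bed")
print(" ".join(cmd))  # prints: tin.py -i sample1.bam -r hg19_RefSeq.bed -c 10
```

Calling `run_tin("sample1.bam", "hg19_RefSeq.bed")` should leave a per-transcript table and a per-sample summary next to where it was run. The summary's median TIN is the value typically used as the per-sample quality covariate.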

5.7 years ago by i.sudbery 21k

0

But tin.py is quite slow at computing TIN because it processes transcripts sequentially. I have a large BAM file (~35 GB), and it took 18 hours to process. Is there a way to speed it up using multithreading or multiprocessing?

4.8 years ago by Abhishek Shakya • 0
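
One possible workaround, sketched here under assumptions rather than as an official RSeQC feature: TIN is computed per transcript, so the BED12 gene model can be split into chunks and one tin.py process run per chunk against the same indexed BAM file. The sketch assumes tin.py writes a table ending in `.tin.xls` into its working directory, which is why each chunk runs in its own directory. The function names and the `merged.tin.xls` output name are hypothetical.

```python
import os
import subprocess
from concurrent.futures import ThreadPoolExecutor


def split_bed(bed_path, n_chunks, out_dir):
    """Deal the BED lines round-robin into chunk files, one per sub-directory."""
    with open(bed_path) as handle:
        lines = [ln for ln in handle
                 if ln.strip() and not ln.startswith(("#", "track", "browser"))]
    n_chunks = max(1, min(n_chunks, len(lines)))  # avoid empty chunk files
    chunk_paths = []
    for i in range(n_chunks):
        chunk_dir = os.path.join(out_dir, f"chunk_{i}")
        os.makedirs(chunk_dir, exist_ok=True)
        chunk_path = os.path.join(chunk_dir, "genes.bed")
        with open(chunk_path, "w") as out:
            out.writelines(lines[i::n_chunks])
        chunk_paths.append(chunk_path)
    return chunk_paths


def run_chunk(bam_path, chunk_bed):
    """Run tin.py on one chunk, inside that chunk's own directory."""
    workdir = os.path.dirname(chunk_bed)
    subprocess.run(
        ["tin.py", "-i", os.path.abspath(bam_path), "-r", os.path.abspath(chunk_bed)],
        cwd=workdir, check=True,
    )
    return workdir


def merge_tin_tables(workdirs, merged_path):
    """Concatenate per-chunk *.tin.xls tables, keeping a single header line."""
    header_written = False
    with open(merged_path, "w") as out:
        for workdir in workdirs:
            for name in sorted(os.listdir(workdir)):
                if not name.endswith(".tin.xls"):  # assumed output suffix
                    continue
                with open(os.path.join(workdir, name)) as handle:
                    header = handle.readline()
                    if not header_written:
                        out.write(header)
                        header_written = True
                    out.writelines(handle)


def parallel_tin(bam_path, bed_path, out_dir, n_workers=4):
    """Split the gene model, run the chunks concurrently, merge the tables."""
    chunks = split_bed(bed_path, n_workers, out_dir)
    # Threads suffice: each one only waits on an external tin.py process.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        workdirs = list(pool.map(lambda bed: run_chunk(bam_path, bed), chunks))
    merged = os.path.join(out_dir, "merged.tin.xls")
    merge_tin_tables(workdirs, merged)
    return merged
```

A call such as `parallel_tin("sample1.bam", "hg19_RefSeq.bed", "tin_chunks", n_workers=8)` (placeholder file names) would produce one merged per-transcript table. The per-chunk summary files each cover only their own chunk, so the sample-level mean or median TIN has to be recomputed from the merged table. Disk throughput on a 35 GB BAM file may also limit how much the extra workers help.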