Hi,
I ran the Tuxedo pipeline on RNA-seq data and used RSEM/limma for the differential expression analysis.
I noticed something peculiar in the counts matrix from RSEM: some transcripts have exactly the same expression level as each other in every sample:
"TCONS_00086761" 46.23 46.23 61.40 74.93 46.23 90.77 37.04 76.97 165.92 86.25 30.33 45.82 0.00 68.88 76.96 72.05 14.71 46.14 61.66 15.41 70.10 15.42 85.33 30.66
"TCONS_00086762" 46.23 46.23 61.40 74.93 46.23 90.77 37.04 76.97 165.92 86.25 30.33 45.82 0.00 68.88 76.96 72.05 14.71 46.14 61.66 15.41 70.10 15.42 85.33 30.66
When it happens, it is always on consecutive transcripts (TCONS_000004 and TCONS_000005, for example), and they belong to the same gene ID (XLOC). I haven't found anything about this on forums or in articles.
My guess is that Cufflinks misinterpreted a single transcript as several, but I am not sure. And if they were different isoforms of the same gene, they should have different expression levels, right?
Has anyone already noticed this, and is there an explanation?
Thanks, Corentin
Thank you for your answer,
I had a look in IGV and these transcripts are indeed overlapping isoforms (they are almost identical).
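In case it helps anyone check their own matrix for the same pattern, here is a minimal Python sketch that groups transcripts whose values are identical across all samples. It assumes the matrix has already been loaded into a dict mapping transcript ID to a list of per-sample values; the function name `find_identical_rows` and the third row of the toy data are made up for illustration, and the first two rows only use the first six values shown above.

```python
from collections import defaultdict


def find_identical_rows(matrix):
    """Return lists of transcript IDs that share exactly the same values.

    `matrix` is assumed to map transcript ID -> list of per-sample values
    (a hypothetical in-memory layout, not RSEM's own file format).
    """
    groups = defaultdict(list)
    for transcript_id, values in matrix.items():
        # A tuple of the values is hashable, so identical rows share one key.
        groups[tuple(values)].append(transcript_id)
    # Keep only the value patterns that occur for more than one transcript.
    return [ids for ids in groups.values() if len(ids) > 1]


# Toy data: the first two rows are truncated from the post,
# the third row is invented purely as a non-duplicate example.
matrix = {
    "TCONS_00086761": [46.23, 46.23, 61.40, 74.93, 46.23, 90.77],
    "TCONS_00086762": [46.23, 46.23, 61.40, 74.93, 46.23, 90.77],
    "TCONS_00000001": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
}

duplicates = find_identical_rows(matrix)
print(duplicates)  # [['TCONS_00086761', 'TCONS_00086762']]
```

This compares values exactly, which matches what I saw in the rounded matrix; with unrounded values a tolerance-based comparison might be needed instead.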