I'm exploring the results from Cuffdiff:
1) gene_differential_expression_testing
2) transcript_differential_expression_testing
For some genes I see multiple entries in 2), with opposite log2 fold-change directions, for example:
TMEM51 chr1:15479027-15546974 A B OK 0.000151584 0.563561 11.8602
TMEM51 chr1:15479027-15546974 A B OK 0.741354 3.92E-05 -14.2062
TMEM51 chr1:15479027-15546974 A B OK 2.39194 0.460979 -2.37541
Moreover, the gene TMEM51 is missing from 1). Is that normal?
In 2), are these repeated gene names different isoforms? If so, how can I tell whether an isoform is novel or already known?
To be honest, I don't use the Tuxedo pipeline if I can avoid it. For gene expression I'd use htseq-count -> DESeq2; for transcript expression I'd use Salmon/kallisto and EBSeq (though EBSeq is pretty restrictive on the model you can provide, but then again so is Tuxedo).
If they're known isoforms, then they can absolutely be biologically relevant; isoform switching can certainly happen. An RNA-seq sample is a snapshot of what's happening biologically at a given moment, which means isoform switching could have occurred at different times and still be captured in the same sample.
It depends on what you want to look at, really. I disagree with Tuxedo's gene-level methodology of using 'windows' to collapse everything down, and it's very difficult to understand how it calls 'novel' transcripts. If I were forced to use Tuxedo, I'd probably do a run looking for all known features, use the transcripts file to identify regions of interest (isoform switching events), and use the gene-level output to make sure everything was as I expected it to be.
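As a rough illustration of what "use the transcripts file to identify regions of interest" could look like, here is a minimal Python sketch that groups transcript-level rows by gene and flags genes whose entries have log2 fold changes in opposite directions. It is not part of Cuffdiff: the function names `parse_rows` and `switch_candidates` are made up for this example, and the column meaning (gene, locus, sample_1, sample_2, status, value_1, value_2, log2 fold change) is an assumption based on the rows pasted in the question, so check it against your actual file header.

```python
from collections import defaultdict

# Rows copied from the question. ASSUMED column order (not confirmed by the
# thread): gene, locus, sample_1, sample_2, status, value_1, value_2, log2FC.
ROWS = """\
TMEM51 chr1:15479027-15546974 A B OK 0.000151584 0.563561 11.8602
TMEM51 chr1:15479027-15546974 A B OK 0.741354 3.92E-05 -14.2062
TMEM51 chr1:15479027-15546974 A B OK 2.39194 0.460979 -2.37541
"""


def parse_rows(text):
    """Parse whitespace-separated rows into dicts; skip malformed lines."""
    records = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 8:
            continue  # not a data row in the assumed 8-column layout
        gene, locus, sample_1, sample_2, status, value_1, value_2, log2fc = fields
        records.append({
            "gene": gene,
            "locus": locus,
            "status": status,
            "value_1": float(value_1),
            "value_2": float(value_2),
            "log2fc": float(log2fc),
        })
    return records


def switch_candidates(records):
    """Return {gene: [log2FC, ...]} for genes with both up and down entries."""
    by_gene = defaultdict(list)
    for rec in records:
        if rec["status"] == "OK":  # only keep rows whose test status is OK
            by_gene[rec["gene"]].append(rec["log2fc"])
    return {
        gene: lfcs
        for gene, lfcs in by_gene.items()
        if min(lfcs) < 0 < max(lfcs)
    }


if __name__ == "__main__":
    for gene, lfcs in switch_candidates(parse_rows(ROWS)).items():
        print(gene, lfcs)
```

Anything this flags is only a candidate. In the rows above, the two most extreme fold changes each involve one value very close to zero, so it is worth looking at the underlying expression values (and the significance columns, which are not shown here) before calling it a switch.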