6.2 years ago
yaminivadapally
How many output files are generated in a TopHat2 alignment? If I give 5 input reads with a reference genome for alignment, will I get 5 output files (one for every read) or only a single file?
You should know that the old 'Tuxedo' pipeline of TopHat(2) and Cufflinks is no longer the "advisable" tool for RNA-seq analysis. The software is deprecated or in low maintenance and should be replaced by HISAT2, StringTie and Ballgown. See this paper: Transcript-level expression analysis of RNA-seq experiments with HISAT, StringTie and Ballgown. There are also other alternatives, including alignment with STAR or BBMap, or pseudo-alignment using Salmon followed by DESeq2 or edgeR.
While I've heard this opinion a lot, I don't think this is necessarily true:
1) I know of at least one project where I needed to use a TopHat2 alignment to recover a known splicing event. So, there may be rare cases where it is helpful to use TopHat, particularly if the downstream program is developed with TopHat-formatted alignments.
2) For gene expression, every time that I've started with a TopHat2 alignment and had some sort of unexpected gene expression pattern, the trend for normalized gene expression has been essentially the same with a STAR re-alignment (no dramatic difference, such as a different qualitative trend). I'm sure there are exceptions for other genes and/or other applications, but I want to emphasize that the TopHat2 alignment is often OK.
That said, it is good to know that there are more recent aligners (among which I have a slight preference for STAR over HISAT).