I have been trying to use the Gencode basic GTF annotation downloaded from the UCSC Table Browser with TopHat2, but I keep getting the following error: "Warning: TopHat did not find any junctions in GTF file". I have also downloaded the same version directly from Gencode, and that file was accepted by TopHat. However, the number of lines in the two files is wildly different:
283548 lines from UCSC
1639850 lines from Gencode (I created the 'basic' version myself by grep'ing for 'basic')
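One way to see where the extra lines come from is to tally the feature types in each file rather than only counting lines. Below is a minimal sketch of that idea, assuming a standard tab-separated GTF with the feature type in column 3. The function name `count_features` and the example records are hypothetical, made up purely for illustration. Unlike a plain grep, the sketch also skips `#` comment lines.

```python
from collections import Counter


def count_features(lines, keyword="basic"):
    """Tally GTF feature types (column 3) for lines that contain `keyword`.

    Hypothetical helper: it mimics a plain-text grep for `keyword`,
    but ignores '#' comment lines and malformed records.
    """
    counts = Counter()
    for line in lines:
        # Skip header/comment lines and lines without the keyword.
        if line.startswith("#") or keyword not in line:
            continue
        fields = line.rstrip("\n").split("\t")
        # A GTF record has 9 tab-separated columns; skip anything shorter.
        if len(fields) < 9:
            continue
        counts[fields[2]] += 1
    return counts


# Made-up records for illustration only (not real annotation data).
example = [
    "##description: made-up header\n",
    'chr1\tHAVANA\tgene\t100\t900\t.\t+\t.\tgene_id "G1";\n',
    'chr1\tHAVANA\ttranscript\t100\t900\t.\t+\t.\tgene_id "G1"; transcript_id "T1"; tag "basic";\n',
    'chr1\tHAVANA\texon\t100\t300\t.\t+\t.\tgene_id "G1"; transcript_id "T1"; tag "basic";\n',
    'chr1\tHAVANA\texon\t600\t900\t.\t+\t.\tgene_id "G1"; transcript_id "T1"; tag "basic";\n',
    'chr1\tHAVANA\texon\t100\t200\t.\t+\t.\tgene_id "G1"; transcript_id "T2";\n',
]

print(count_features(example))
```

With real files, an open file handle can be passed in place of the list, for example `count_features(open(path))`, where `path` is whichever GTF is being inspected. Comparing the tallies for the two files should show which feature types account for the difference in line counts.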
Q1 - Has anyone managed to use UCSC Gencode GTF data with TopHat?
Q2 - Why are the contents of the two sources so different?
I have spent years running ChIP-seq analysis, but RNA-seq is new to me so I may be missing something obvious! If you are asking yourself "why is he using Gencode basic?" it is because I used this set with ChIP-seq data.