I have two questions about running TopHat:
(1) At the step “Searching for junctions via segment mapping”, it takes a really long time, and I got the following message:
“Coverage-search algorithm is turned on, making this step very slow
Please try running TopHat again with the option (--no-coverage-search) if this step takes too much time or memory.”
I’d like to know what exactly the differences between “coverage-search” and “no coverage-search” are. If I use the “--no-coverage-search” option, what impact might it have on the TopHat results and their accuracy?
(2) I use the -G option to provide a gene model annotation GTF file (genes.gtf). I notice that on every TopHat run, it builds a Bowtie index for genes.gtf on the fly:
“Building Bowtie index from genes.fa”
This step takes two hours (I use the main human annotation GTF file from GENCODE).
I have 3 conditions, and each condition has 3-4 pairs of FASTQ read files, so I have 10 TopHat runs in my script. This “Building Bowtie index from genes.fa” step was executed 10 times, even though all runs use the same genes.gtf file.
So I am wondering if there is a way to let TopHat reuse the Bowtie index for genes.gtf produced by the first run in the subsequent runs?
I’d greatly appreciate any ideas and suggestions.
Thank you very much!
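
One possible approach to question (2), sketched here under the assumption that the installed TopHat version supports the --transcriptome-index option: build the transcriptome index once by calling TopHat with -G and --transcriptome-index but no reads, then point every later run at the same index prefix. The Python helper below only assembles the command lines. The function name, index prefixes, sample names, and output directory names are hypothetical placeholders, not anything from the original script, and --no-coverage-search is included only because it is the option named in the warning message.

```python
import shlex


def build_tophat_commands(samples,
                          gtf="genes.gtf",
                          tx_index="transcriptome_index/known",   # hypothetical prefix
                          genome_index="genome_index/genome",     # hypothetical prefix
                          coverage_search=False):
    """Return a list of argv lists: one index-building command, then one per sample.

    `samples` maps a sample name to a (reads_1, reads_2) pair of FASTQ paths.
    """
    # Assumption: with -G and --transcriptome-index but no reads, TopHat only
    # builds the transcriptome index files under `tx_index`.
    commands = [
        ["tophat", "-G", gtf, f"--transcriptome-index={tx_index}", genome_index]
    ]

    for name, (reads_1, reads_2) in samples.items():
        # Later runs reference the prebuilt index and leave out -G.
        cmd = ["tophat", "-o", f"tophat_{name}",
               f"--transcriptome-index={tx_index}"]
        if not coverage_search:
            cmd.append("--no-coverage-search")
        cmd += [genome_index, reads_1, reads_2]
        commands.append(cmd)

    return commands


if __name__ == "__main__":
    # Hypothetical sample layout; replace with the real FASTQ pairs.
    example_samples = {
        "cond1_rep1": ("cond1_rep1_1.fastq", "cond1_rep1_2.fastq"),
        "cond1_rep2": ("cond1_rep2_1.fastq", "cond1_rep2_2.fastq"),
    }
    for argv in build_tophat_commands(example_samples):
        # Dry run: print each command instead of executing it.
        print(shlex.join(argv))
```

To run the commands instead of printing them, each argv list can be passed to subprocess.run(argv, check=True). The index-building command has to finish before any of the per-sample commands start.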
Thank you very much for your advice!
This is really helpful!