I am running tophat/2.1.1 with bowtie/2.2.9. I run the following command:
tophat -G reference_files/Homo_sapiens.GRCh38.86.gtf -p4 -o tophat_results/ reference_files/GRCh38 data/S_01_L003_R1.fastq
And I get the following error:
[2016-11-03 16:39:02] Beginning TopHat run (v2.1.1)
[2016-11-03 16:39:02] Checking for Bowtie
Bowtie version: 2.2.9.0
[2016-11-03 16:39:02] Checking for Bowtie index files (genome)..
Error: Could not find Bowtie 2 index files (reference_files/GRCh38.*.bt2l)
I only have .bt2 files.
You did not specify the basename of your bowtie2 index files on your command line. Try the following command:
That does not work. I get the same error: Could not find Bowtie 2 index files (reference_files/GRCh38.*.bt2l)
What's the name of the index files you created?
GRCh38.no_alt_analysis_set.fna.bowtie_index.1.bt2l (then the same for 2,3,4 and rev.1,rev.2)
Then you need to use
GRCh38.no_alt_analysis_set.fna.bowtie_index
as the index basename.
tophat -p4 -G reference_files/Homo_sapiens.GRCh38.86.gtf -o tophat_results/ reference_files/GRCh38.no_alt_analysis_set.fna.bowtie_index reference_files/GRCh38 data/S_01_L003_R1.fastq
A suggestion: give your files better names to make this easier on yourself, and don't keep indexes and data in the same directory. Please replace spaces in directory/file names with "_".
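To illustrate how the basename relates to the index file names mentioned above, here is a minimal sketch. The function `index_basename` is a hypothetical helper, not part of TopHat or Bowtie; it only assumes that the first index file ends in `.1.bt2` or `.1.bt2l`.

```shell
# Hypothetical helper (not a TopHat/Bowtie command): strip the trailing
# ".1.bt2l" or ".1.bt2" from the first index file to get the basename.
index_basename() {
    name="${1%.1.bt2l}"      # large-index suffix
    name="${name%.1.bt2}"    # small-index suffix
    printf '%s\n' "$name"
}

index_basename reference_files/GRCh38.no_alt_analysis_set.fna.bowtie_index.1.bt2l
# prints: reference_files/GRCh38.no_alt_analysis_set.fna.bowtie_index
```

The printed value is what goes on the tophat command line in place of reference_files/GRCh38.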
Hello,
When I do this, I get the following error. What is the meaning of 'reference_files/GRCh38' after the 'reference_files/GRCh38.no_alt_analysis_set.fna.bowtie_index'?
It would seem that the second 'reference_files/GRCh38' is a typo?
Also, if I need a transcriptome-index file ('--transcriptome-index=transcriptome_data/known/'), would I add that right before 'data/S_01_L003_R1.fastq'?
I copied and pasted the command from your original post since I thought you had the directory/file paths correct. The command as written assumes that you are running it in a directory which contains a subdirectory called reference_files. Is that not the case? Adjust the file paths as they exist on your machine relative to wherever you are running this command from.
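One quick way to confirm that a relative path resolves from the current directory is to test for the first index file before launching TopHat. This is only a sketch: `has_bt2_index` is a hypothetical helper, and it assumes the index files follow the usual `<basename>.1.bt2` or `<basename>.1.bt2l` naming.

```shell
# Hypothetical helper: succeed if the first Bowtie 2 index file
# (<basename>.1.bt2 or <basename>.1.bt2l) exists at the given basename.
has_bt2_index() {
    [ -e "$1.1.bt2" ] || [ -e "$1.1.bt2l" ]
}

# Relative paths are resolved against the directory you run this from.
if has_bt2_index reference_files/GRCh38.no_alt_analysis_set.fna.bowtie_index; then
    echo "index found"
else
    echo "index not found from $(pwd)"
fi
```

If this reports that the index is not found, the basename or the working directory is wrong, and tophat would fail with the same "Could not find Bowtie 2 index files" error.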
Generically do this.
The /path_to/ part should be replaced with a real file path on your computer.
If you have pre-built the transcriptome index (with a special run of TopHat with the -G option), then you could use that index in place of the whole-genome one (like in the example above). You will no longer need to add the -G option and would only be mapping to the "known" transcriptome in this case.