Trouble using TopHat (bowtie index genome.*.bt2l)
7.1 years ago

Hello!

I'm just starting to use TopHat and I have a small problem that I have not been able to solve. I want to align several human transcriptomes, so I downloaded the human reference genome (ftp://igenome:G3nom3s4u@ussd-ftp.illumina.com/Homo_sapiens/NCBI/GRCh38/Homo_sapiens_NCBI_GRCh38.tar.gz) and now I want to follow the Tuxedo protocol.

I ran the following commands:

1st. Uncompress the genome:

tar xvfz Homo_sapiens_NCBI_GRCh38.tar.gz

2nd. Make a working directory:

mkdir Alignments

3rd. Create symbolic links to annotation files and bowtie index (inside the working directory):

ln -s /path_to/Hsa38/Annotation/Archives/archive-2015-08-11-09-31-31/Genes/genes.gtf
ln -s /path_to/Hsa38/Sequence/Bowtie2Index/genome.*.

4th. Try to run tophat (inside the working directory):

tophat -p 8 -G genes.gtf -o sample_output --library-type=fr-firststrand genome sample.fq

The output message was the following one:

[2017-11-07 00:29:47] Beginning TopHat run (v2.1.1)
-----------------------------------------------
[2017-11-07 00:29:47] Checking for Bowtie
          Bowtie version:    2.2.9.0
[2017-11-07 00:29:47] Checking for Bowtie index files (genome)..
Error: Could not find Bowtie 2 index files (genome.*.bt2l)

After that I tried to find any file with the .bt2l extension: find path_to/Hsa38/ -iname *bt2l, but without success. Does anyone know where the index is, or how I can solve this problem?

Thanks in advance.

ADD COMMENT

You should know that the old 'Tuxedo' pipeline of TopHat and Cufflinks is no longer the "advisable" tool for RNA-seq analysis. The software is deprecated and in low maintenance, and should be replaced by HISAT2, StringTie and Ballgown. See this paper: Transcript-level expression analysis of RNA-seq experiments with HISAT, StringTie and Ballgown. (If you can't get access to that publication, let me know and I'll -cough- help you.) There are also other alternatives, including alignment with STAR or BBMap, or pseudo-alignment using Salmon.
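For orientation only, here is a rough sketch of what the single-end TopHat call from the question might look like with HISAT2. The index prefix grch38/genome and the output name sample.sam are placeholders I made up; HISAT2 needs its own index (*.ht2 files), not the Bowtie 2 one. As far as I know from the HISAT2 manual, --rna-strandness R is the single-end counterpart of TopHat's fr-firststrand. The snippet only prints the command so you can inspect it before running it.

```shell
# Placeholder names (assumptions): a HISAT2 index prefix and the reads file.
index_prefix="grch38/genome"   # hypothetical prefix of the *.ht2 index files
reads="sample.fq"

# -p 8: eight threads; -x: index prefix; -U: unpaired reads; -S: SAM output.
# --rna-strandness R: single-end equivalent of TopHat's fr-firststrand.
cmd="hisat2 -p 8 --rna-strandness R -x $index_prefix -U $reads -S sample.sam"

# Dry run: print the command instead of executing it.
echo "$cmd"
```

Once the index and reads are in place, the printed line can be pasted into the shell as is.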

ADD REPLY

Thanks so much! I've just got the publication =D

ADD REPLY

Just look in the actual folder and see what the files are named so you can adjust the symlinks as necessary. Should genome.*. be genome.* instead? The wildcard is not going to be recognized as written.

ADD REPLY

I have just tried it, but it does not work:

ln -s /path_to/Hsa38/Sequence/Bowtie2Index/genome.*
ln: target '/path_to/Hsa38/Sequence/Bowtie2Index/genome.rev.2.bt2' is not a directory
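
A note on this error, as a hedged sketch: when the wildcard expands to several files, ln treats its last argument as the destination directory, which would explain the complaint that genome.rev.2.bt2 is not a directory. Giving an explicit destination (here ".", the current directory) should avoid that. The temporary directory and empty placeholder files below are assumptions for illustration only, not the real iGenomes data.

```shell
# Build a throwaway layout with empty stand-ins for the Bowtie 2 index files.
tmp=$(mktemp -d)
mkdir -p "$tmp/Bowtie2Index" "$tmp/Alignments"
for part in 1 2 3 4 rev.1 rev.2; do
    touch "$tmp/Bowtie2Index/genome.$part.bt2"
done

cd "$tmp/Alignments"

# With several sources, the last argument must be the destination directory.
ln -s "$tmp"/Bowtie2Index/genome.*.bt2 .

# Count the symlinks that were created in the working directory.
n_links=$(find . -maxdepth 1 -type l -name 'genome.*.bt2' | wc -l | tr -d ' ')
echo "created $n_links symlinks"
```

With the links sitting in the working directory, the bare prefix genome would then be the index name to pass on.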
ADD REPLY

Create the symbolic link to the directory, not the file prefix:

ln -s /path_to/Hsa38/Sequence/Bowtie2Index/ symlinkGenome

Then, execute tophat with:

tophat -p 8 -G genes.gtf -o sample_output --library-type=fr-firststrand symlinkGenome/genome sample.fq
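
Before launching a long run, it may help to confirm that the prefix really points at a complete index. The sketch below assumes the usual Bowtie 2 layout of six files per index (.1, .2, .3, .4, .rev.1, .rev.2), ending in .bt2 for a small index or .bt2l for a large one. The helper name check_bt2_index and the temporary demo files are hypothetical, made up for illustration.

```shell
# Hypothetical helper: succeeds if all six index files exist for the prefix,
# either as a small (.bt2) or a large (.bt2l) Bowtie 2 index.
check_bt2_index() {
    prefix=$1
    for ext in bt2 bt2l; do
        missing=0
        for part in 1 2 3 4 rev.1 rev.2; do
            [ -e "$prefix.$part.$ext" ] || missing=1
        done
        if [ "$missing" -eq 0 ]; then
            echo "found complete .$ext index at $prefix"
            return 0
        fi
    done
    echo "no complete Bowtie 2 index at $prefix" >&2
    return 1
}

# Demo on empty placeholder files reached through a directory symlink.
demo=$(mktemp -d)
mkdir "$demo/Bowtie2Index"
for part in 1 2 3 4 rev.1 rev.2; do
    touch "$demo/Bowtie2Index/genome.$part.bt2"
done
ln -s "$demo/Bowtie2Index" "$demo/symlinkGenome"
check_bt2_index "$demo/symlinkGenome/genome"
```

On the real data, the call would be check_bt2_index symlinkGenome/genome from inside the working directory.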

As my colleague Wouter has also stated, TopHat/TopHat2 is 'retired' and HISAT/HISAT2 is the upgraded version.

Kevin

ADD REPLY
