Dear all,
I need to align some fastq files to the human genome. I prepared the reference index files and transcriptomes with bowtie and tophat as follows:
bowtie2-build -f GRCh38.84.fa GRCh38.84
tophat2 -p 32 -G GRCh38.84.gtf --transcriptome-index=GRCh38.84.tr GRCh38.84
This created the files GRCh38.84.1.bt2l, GRCh38.84.2.bt2l, GRCh38.84.3.bt2l, GRCh38.84.4.bt2l, GRCh38.84.rev.1.bt2l and GRCh38.84.rev.2.bt2l, plus the GRCh38.84.tr folder with TopHat's files.
I removed the Illumina adapters with trimmomatic from the input files:
java -jar /usr/bin/trimmomatic.jar PE -threads 16 -phred33 input1.fastq input2.fastq i1_paired.fastq i2_paired.fastq i1_unpaired.fastq i2_unpaired.fastq ILLUMINACLIP:./IlluminaTags/TruSeq_RNA.fa:2:30:10:1:true
then I ran Tophat:
tophat2 -o outputFolder -G GRCh38.84.gtf -p 32 --transcriptome-index=GRCh38.84.tr -p 32 i1_paired.fastq i2_paired.fastq
but the output was:
[2016-04-03 11:19:42] Beginning TopHat run (v2.1.1)
-----------------------------------------------
[2016-04-03 11:19:42] Checking for Bowtie
Bowtie version: 2.2.6.0
[2016-04-03 11:19:43] Checking for Bowtie index files (transcriptome)..
[2016-04-03 11:19:43] Checking for Bowtie index files (genome)..
Error: Could not find Bowtie 2 index files (i1_paired.fastq.*.bt2l)
I also provided the unpaired files with
tophat2 -o outputFolder -G GRCh38.84.gtf -p 32 --transcriptome-index=GRCh38.84.tr -p 32 i1_paired.fastq, i1_unpaired.fastq i2_paired.fastq, i2_unpaired.fastq
and the untrimmed files:
tophat2 -o outputFolder -G GRCh38.84.gtf -p 32 --transcriptome-index=GRCh38.84.tr -p 32 input1.fastq input2.fastq
but the result was the same.
What am I getting wrong? Do I really need to index the query files with bowtie as well? But in that case, what would be the use of tophat? And against what should I index the files? The human genome?
Thank you
L
Are the GRCh38.84.tr index files located in the current folder? If not, you will need to provide the full (or relative) path to the folder containing those files.
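One more thing worth checking: the error mentions i1_paired.fastq.*.bt2l, which suggests TopHat took the first read file as the Bowtie 2 index base name. TopHat expects the index base name as the first positional argument, before the read files, and none of the tophat2 alignment commands in the question include it. A minimal sketch of what the call might look like, assuming the index files built by bowtie2-build sit in the current folder; the helper function has_bt2_index is hypothetical and only there to check the path before running:

```shell
# Hypothetical helper: succeeds if a Bowtie 2 index with this base name exists.
# Small indexes end in .bt2, large ones (as in the question) in .bt2l.
has_bt2_index() {
    base="$1"
    [ -e "${base}.1.bt2" ] || [ -e "${base}.1.bt2l" ]
}

# Base name passed to bowtie2-build, without any extension.
# Use a full or relative path here if the index lives in another folder.
index_base="GRCh38.84"

if has_bt2_index "$index_base"; then
    # The index base name comes first among the positional arguments,
    # followed by the read files. Other options are as in the question.
    tophat2 -o outputFolder -p 32 -G GRCh38.84.gtf \
        --transcriptome-index=GRCh38.84.tr \
        "$index_base" i1_paired.fastq i2_paired.fastq
else
    echo "No Bowtie 2 index found for base name: $index_base" >&2
fi
```

So there should be no need to index the query files themselves; only the reference genome is indexed, and the reads are passed after the index base name.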