Loop tophat to align multiple RNA-seq files
Hello,
I'm a beginner trying to use a loop to align multiple paired-end RNA-seq samples with tophat.
Here's what I have that isn't working (it's basically this bash loop for aligning RNA-seq data, slightly modified):
for i in $( ls *.fastq | rev | cut -c 13- | rev | uniq )
do
tophat -o /path/to/output/${i} -G /path/to/genes.gtf /path/to/index/genome ${i}R1_001.fastq ${i}R2_001.fastq
done
The error is this:
Why doesn't it realize that ${i} is part of the filename?
Thanks in advance.
Emma
RNA-Seq
tophat
loop
alignment
Add echo $i to your loop so you can see what $i is. Maybe that command is not returning what you think it should.
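A minimal sketch of that check, assuming files named like sampleA_R1_001.fastq (the sample names and the function name show_loop_values are made up for illustration). It builds the prefix list the same way as the loop in the question and prints each value of $i in brackets, along with the two file names that would be built from it:

```shell
#!/usr/bin/env bash
# Print each value the question's loop would assign to $i, plus the two
# file names that would be built from it. Runs in a subshell so the
# caller's working directory is left alone.
show_loop_values() {
    (
        cd "$1" || exit 1
        for i in $( ls *.fastq 2>/dev/null | rev | cut -c 13- | rev | uniq )
        do
            echo "i=[${i}]"
            echo "R1=${i}R1_001.fastq R2=${i}R2_001.fastq"
        done
    )
}

# Demo on throwaway empty files with made-up sample names.
demo_dir=$(mktemp -d)
touch "$demo_dir/sampleA_R1_001.fastq" "$demo_dir/sampleA_R2_001.fastq"
show_loop_values "$demo_dir"
rm -r "$demo_dir"
```

If the bracketed value is empty, or is not the prefix you expect, the problem is in the ls | rev | cut pipeline rather than in tophat.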
If you are aligning fastq files on a server (and I expect you are), it is much faster to run them in parallel than sequentially.
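One possible way to do that with plain bash background jobs is sketched below. This is an illustration, not a tested pipeline: TOPHAT_CMD, MAX_JOBS, and the function names are made up here, the sample prefixes are hypothetical, and the paths are the placeholders from the question. By default it only prints the commands it would run.

```shell
#!/usr/bin/env bash
# Dry run by default: TOPHAT_CMD (a made-up variable) prints the command
# instead of running it. Set TOPHAT_CMD=tophat to really align.
TOPHAT_CMD=${TOPHAT_CMD:-echo tophat}

# Align one sample; $1 is a prefix such as "sampleA_".
align_one() {
    $TOPHAT_CMD -o "/path/to/output/${1}" -G /path/to/genes.gtf \
        /path/to/index/genome "${1}R1_001.fastq" "${1}R2_001.fastq"
}

# Start the samples as background jobs, pausing after every MAX_JOBS
# (default 2, also a made-up variable) until that batch has finished.
align_in_batches() {
    started=0
    for prefix in "$@"
    do
        align_one "$prefix" &
        started=$((started + 1))
        if [ "$started" -ge "${MAX_JOBS:-2}" ]; then
            wait
            started=0
        fi
    done
    wait
}

# Hypothetical sample prefixes; the printed order may vary between runs.
align_in_batches sampleA_ sampleB_ sampleC_
```

How many jobs to run at once depends on the cores and memory available on your server. tophat also has its own -p/--num-threads option, so you may prefer fewer simultaneous samples with more threads each.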
The error message you get suggests that bash cannot find any fastq files in your current directory, so I guess the first step would be to make sure that the fastq files are in the same directory from which you're running your script. Also make sure they're not actually .fastq.gz files.
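Both checks can be done in one go. The sketch below counts plain and gzipped fastq files in a directory; the function names count_matches and check_fastq are made up for illustration:

```shell
#!/usr/bin/env bash
# Count how many of the arguments are existing files. A glob that matches
# nothing is passed through literally and fails the -e test, so it counts as 0.
count_matches() {
    n=0
    for f in "$@"
    do
        if [ -e "$f" ]; then
            n=$((n + 1))
        fi
    done
    echo "$n"
}

# Report plain and gzipped fastq files in a directory (default: current one).
check_fastq() {
    dir=${1:-.}
    echo "plain .fastq files: $(count_matches "$dir"/*.fastq)"
    echo "gzipped .fastq.gz files: $(count_matches "$dir"/*.fastq.gz)"
}

check_fastq .
```

If the first count is 0, the loop has nothing to iterate over. If only the second count is non-zero, the files are compressed and the *.fastq pattern will never match them.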
If all of the above checks out, you could also try this:
for file in ./*R1_001.fastq
do
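    # File name without the directory and the .fastq extension
    FBASE=$(basename "$file" .fastq)
    # Drop the trailing R1_001 to get the sample prefix
    BASE=${FBASE%R1_001}
    tophat -o /path/to/output/${BASE} -G /path/to/genes.gtf /path/to/index/genome ${BASE}R1_001.fastq ${BASE}R2_001.fastq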
done
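To see what the two variables in that loop end up holding, here is a short trace for one file. The file name is hypothetical; note there must be no space after the = in a bash assignment.

```shell
#!/usr/bin/env bash
file=./sampleA_R1_001.fastq          # hypothetical file name

# basename strips the leading directory and the given suffix.
FBASE=$(basename "$file" .fastq)
# ${var%pattern} removes the shortest match of pattern from the end.
BASE=${FBASE%R1_001}

echo "$FBASE"                # prints sampleA_R1_001
echo "$BASE"                 # prints sampleA_
echo "${BASE}R2_001.fastq"   # prints sampleA_R2_001.fastq
```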
Can you show us example file names for one sample?
Sure, here's an example:
See if this helps
Thanks for the suggestion! Still get the same error though.
Curious: what OS/shell are you using?
can you just try
to see what it prints out and can you also try
to see whether the output from above matches the file names?
edit...I just realized that it piggy backs on Igor's answer :)
edit2...try also to use the whole path for your fastq files, i.e.: