Hello,
I've got a large number of fastq files generated from a paired-end single-cell RNA-seq experiment.
I'm looking to align them back to mm10 using HISAT2. I can do this if I run every pair individually, but is there a way to get HISAT2 to process them all from a single script, one after the other?
I'm relatively new to the Ubuntu command line, so feel free to dumb it down. Thanks in advance.
The different files are different samples, right?
You can use for loops in bash, which are pretty useful. If you can share the naming pattern of your files (R1, R2) and anything else relevant, we can get this done easily.
Hey, thanks for your help. Yes, two files per sample, one for the forward reads and one for the reverse. Their naming is quite complicated:
sample 1 F: WTCHG_272965_701502_01 sample 1 R: WTCHG_272965_701502_02
sample 2 F: WTCHG_272965_701503_01 sample 2 R: WTCHG_272965_701503_02
etc
Alright, that's not too complicated :p
So WTCHG_272965_701502_01.fastq is a filename? Or WTCHG_272965_701502_01.fastq.gz?
Yes, sorry, WTCHG_272965_701502_01.fastq.gz is the file name for the forward read.
If I didn't make a mistake the loop would look like this:
Perhaps you should test it first using a few echo statements:
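A minimal sketch of such an echo test, assuming the _01/_02 naming given above. The function name print_pairs is a hypothetical helper chosen for illustration, not the code from the original answer. It only prints file names and runs no aligner, so it is safe to try in any folder:

```shell
#!/usr/bin/env bash
# Dry run: print the forward file, reverse file and output name for each sample.
# print_pairs is a hypothetical helper name; nothing is aligned here.
print_pairs() {
    local dir
    local fwd
    local sample
    dir="${1:-.}"
    # Loop over forward reads only, so each sample is visited exactly once.
    for fwd in "$dir"/*_01.fastq.gz; do
        # Skip the unexpanded pattern when no file matches.
        [ -e "$fwd" ] || continue
        # Strip the forward-read suffix to get the shared sample prefix.
        sample="${fwd%_01.fastq.gz}"
        echo "forward: $fwd reverse: ${sample}_02.fastq.gz output: ${sample}.bam"
    done
}

# Check the current directory.
print_pairs .
```

If each printed line shows one _01 file, the matching _02 file and a single output name, the loop is pairing the files correctly.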
okay, I've tried a number of things based on the above but can't seem to get anything to work properly.
I've created a new folder which contains the mm10 indexes (mm10idx, 8 files) and 4 fastq.gz files, which are the forward and reverse reads of 2 samples, following the above naming pattern.
I changed my directory to that folder and ran the script you've written, but nothing seems to be happening. Could I have some more help, please? Maybe I've omitted something obvious. I know that if I was running it on one file I would type :
so I can't see how what you've written relates to this, or how it can be run in a loop. :/
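The link between the two is that the loop body is just the single-pair command with the file names replaced by variables. Here is a rough sketch of that idea, under some assumptions: the index basename is mm10idx (as in the folder described above), hisat2 is on the PATH, and the output is written as SAM with -S (converting to BAM, e.g. with samtools, would be a separate step). The names align_pairs and DRY_RUN are made up for this illustration:

```shell
#!/usr/bin/env bash
# align_pairs is a hypothetical helper: run one hisat2 command per sample.
# Usage: align_pairs INDEX_BASENAME [DIRECTORY]
# Set DRY_RUN=1 to print the commands instead of running them.
align_pairs() {
    local index
    local dir
    local fwd
    local rev
    local sample
    index="$1"
    dir="${2:-.}"
    for fwd in "$dir"/*_01.fastq.gz; do
        # Skip the unexpanded pattern when no file matches.
        [ -e "$fwd" ] || continue
        sample="${fwd%_01.fastq.gz}"
        rev="${sample}_02.fastq.gz"
        if [ "${DRY_RUN:-0}" = "1" ]; then
            # Preview: show the command that would be run for this pair.
            echo hisat2 -x "$index" -1 "$fwd" -2 "$rev" -S "${sample}.sam"
        else
            # Same command as for a single pair, with variables for the names.
            hisat2 -x "$index" -1 "$fwd" -2 "$rev" -S "${sample}.sam"
        fi
    done
}

# Preview only; remove DRY_RUN=1 to actually run HISAT2 (assumed index: mm10idx).
DRY_RUN=1 align_pairs mm10idx .
```

Saving this as a file and running it with bash from the folder that holds the index and the fastq.gz files should print one hisat2 command per sample. If nothing is printed, the pattern *_01.fastq.gz is not matching any file in that folder.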
I assume you know that
hisat2 with inputfiles ${f}_01.fastq.gz ${f}_02.fastq.gz and output ${f}.bam
was still something you had to fill in? That would then be (based on your example for a single file):
Did you check whether the example with echo showed the expected input and output files?
Yes, I did, but the echo output comes back wrong, as shown in my reply below: fastq appears twice at the end of the file names, and the same file is used for both the forward and reverse reads. I've left your first line unchanged.
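A doubled fastq ending and the same file showing up for both reads is the kind of output you get when the loop variable still holds the whole file name and the suffix is then added again. I can't be sure that is what happened without seeing the exact loop, so take this as a sketch of the usual fix: strip the forward-read suffix first, then rebuild both names from the shared prefix. The helper name sample_prefix is hypothetical:

```shell
#!/usr/bin/env bash
# sample_prefix is a hypothetical helper: remove the forward-read suffix
# so that only the part shared by both files of a pair is left.
sample_prefix() {
    printf '%s\n' "${1%_01.fastq.gz}"
}

fwd="WTCHG_272965_701502_01.fastq.gz"
sample="$(sample_prefix "$fwd")"

# Rebuild both file names from the shared prefix.
echo "prefix:  $sample"
echo "forward: ${sample}_01.fastq.gz"
echo "reverse: ${sample}_02.fastq.gz"
```

The `%` inside `${...}` removes the given text from the end of the value, so the prefix here is WTCHG_272965_701502 and each file name ends in .fastq.gz only once.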
Thank you for all your help! I will try it out and let you know how it goes!