I have many folders containing FASTQ files, all single-end reads.
I want to write a bash script to automate the alignment:
for i in /media/usr/Elements/name/Extracted_SB_reads/C1DNA.extrSB/*.fq
do
bowtie2 --sensitive-local -p8 ${i} -x /media/usr/Elements/usr/bowtie2_index/hg19/ -U ${i}.fq -S $i.sam
done
But when I run the script, it says:
(ERR): bowtie2-align exited with value 1
Extra parameter(s) specified: "/media/usr/Elements/usr/Extracted_SB_reads/C1DNA.extrSB/V300054326_L4_B5GHUMvudRAABGAAA-556.SB.fq"
Error: Encountered internal Bowtie 2 exception (#1)
Command: /home/usr/miniconda3/bin/bowtie2-align-s --wrapper basic-0 --sensitive-local -p8 -x /media/usr/Elements/usr/bowtie2_index/hg19/ -S /media/usr/Elements/usr/Extracted_SB_reads/C1DNA.extrSB/V300054326_L4_B5GHUMvudRAABGAAA-556.SB.fq.sam -U /media/usr/Elements/usr/Extracted_SB_reads/C1DNA.extrSB/V300054326_L4_B5GHUMvudRAABGAAA-556.SB.fq.fq /media/usr/Elements/usr/Extracted_SB_reads/C1DNA.extrSB/V300054326_L4_B5GHUMvudRAABGAAA-556.SB.fq
(ERR): bowtie2-align exited with value 1
Do you have your bowtie2 indices for all the reads files already? Or do you want to map them against the same index? Could you show us the output of ls on the directory where your reads are contained?
These are reads extracted for a transposon and guideRNA. They have to be aligned to the human genome to determine the genomic location of their insertion.
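Two things went wrong in your loop: the bare ${i} right after -p8 is passed to bowtie2 as an extra positional argument (hence "Extra parameter(s) specified"), and -U ${i}.fq points at a file ending in .fq.fq, because $i already ends in .fq. I haven't tested this, but something like this would probably work: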
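#!/bin/bash -ex
# Directory that holds the per-sample folders of extracted reads
MYPATH="/media/usr/Elements/usr/Extracted_SB_reads"
# -name takes a shell glob, not a regex, so there is no trailing "$"
find "${MYPATH}" -maxdepth 3 -type f -name "*.fq" > "${MYPATH}/fq_names.txt"
# Align each file in turn; replace BT2INDEXNAME with the real index basename
while IFS= read -r INFILE
do
    bowtie2 -p 8 -x /media/usr/Elements/usr/bowtie2_index/hg19/BT2INDEXNAME -U "${INFILE}" -S "${INFILE%.fq}.sam"
done < "${MYPATH}/fq_names.txt"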
Do note that BT2INDEXNAME in /media/usr/Elements/usr/bowtie2_index/hg19/BT2INDEXNAME is a placeholder: you need to replace it with the basename of the index files sitting in /media/usr/Elements/usr/bowtie2_index/hg19/. If there are files called myhumangenome.1.bt2 and myhumangenome.2.bt2 in that directory, for example, then /media/usr/Elements/usr/bowtie2_index/hg19/BT2INDEXNAME should be replaced with /media/usr/Elements/usr/bowtie2_index/hg19/myhumangenome.
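If you'd rather not work out the index name by hand, it can be derived from the files in the index directory. This is only a minimal sketch, assuming the index was built with bowtie2-build so that its files end in .1.bt2 (or .1.bt2l for a large index); bt2_index_basename is a made-up helper name, not part of bowtie2:

```shell
#!/bin/bash
# Hypothetical helper: print the prefix to pass to `bowtie2 -x`
# for the first Bowtie 2 index found in a directory.
bt2_index_basename() {
    local dir="${1%/}" f
    for f in "$dir"/*.1.bt2 "$dir"/*.1.bt2l; do
        # An unmatched glob stays literal, so skip anything that does not exist
        [ -e "$f" ] || continue
        # Skip the reverse-index files (name.rev.1.bt2)
        case "$f" in
            *.rev.1.bt2|*.rev.1.bt2l) continue ;;
        esac
        # Strip the suffix to get the basename bowtie2 expects
        f="${f%.1.bt2l}"
        f="${f%.1.bt2}"
        printf '%s\n' "$f"
        return 0
    done
    echo "no Bowtie 2 index found in $dir" >&2
    return 1
}
```

You could then write INDEX=$(bt2_index_basename /media/usr/Elements/usr/bowtie2_index/hg19) and pass -x "$INDEX" to bowtie2.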
Are you really going to run all of this on a laptop, though? It'll probably take forever, and submitting so many tasks one after another through a for loop isn't exactly efficient.
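One cautious way to handle this many files is to print the commands first, look them over, and only then run them or hand them to whatever scheduler you have. The sketch below does just that and runs nothing itself. print_bowtie2_cmds and its two arguments are names invented for this example, and it assumes the paths contain no newlines or single quotes:

```shell
#!/bin/bash
# Hypothetical dry-run helper: print one bowtie2 command per .fq file
# found under a directory, without executing any of them.
print_bowtie2_cmds() {
    local reads_dir="$1" index_prefix="$2" fq
    find "$reads_dir" -maxdepth 3 -type f -name '*.fq' |
    while IFS= read -r fq; do
        # ${fq%.fq}.sam swaps the .fq extension for .sam
        printf "bowtie2 --sensitive-local -p 8 -x '%s' -U '%s' -S '%s'\n" \
            "$index_prefix" "$fq" "${fq%.fq}.sam"
    done
}
```

For example, print_bowtie2_cmds "$MYPATH" /path/to/index_basename > cmds.txt writes the list to a file you can inspect and then run with bash cmds.txt.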
Note: the post has been edited to incorporate corrections discussed in the comments below.
Hi,
Thanks for your help. These fastq files are not very big, as only those reads which have the transposon and the guide RNA have been extracted (wgs).
I tried running the script though, and it gave an error:
(base) usr@usr-X705UDR:/media/usr/Elements/usr$ ./bowtie.sh
+ MYPATH=/media/usr/Elements/usr/Extracted_SB_read
+ find /media/usr/Elements/usr/Extracted_SB_read -maxdepth 3 -type -f name '*.fq$'
./bowtie.sh: line 4: /media/usr/Elements/usr/Extracted_SB_read/fq_names.txt: No such file or directory
Ah, looks like there's a typo there. Could you please change MYPATH=/media/usr/Elements/usr/Extracted_SB_read to MYPATH=/media/usr/Elements/usr/Extracted_SB_reads? I didn't catch the missing s at the end of read. The trace also shows -type -f name where the script should have -type f -name.
Ah super sorry I didn't catch that one. Wrote this on the fly, and I didn't test it, so I had no idea where the errors were. Good on you for finding it!!
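A small guard at the top of the script would have turned that confusing redirect error into a plain message about the missing directory. A minimal sketch, with require_dir being a made-up helper name:

```shell
#!/bin/bash
# Hypothetical guard: fail early with a readable message
# if the reads directory does not exist.
require_dir() {
    if [ ! -d "$1" ]; then
        echo "ERROR: directory not found: $1" >&2
        return 1
    fi
}
```

Calling require_dir "$MYPATH" || exit 1 right after setting MYPATH stops the script before find or the redirect is attempted.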