Hi all,
I am trying to automate a couple of operations. Basically, I have a directory of paired-end (PE) fastq files from the same sample, e.g.:
L16-24MG-A_S14_L001_R1_001.fastq.gz L16-24MG-A_S14_L003_R1_001.fastq.gz
L16-24MG-A_S14_L002_R1_001.fastq.gz L16-24MG-A_S14_L004_R1_001.fastq.gz
L16-24MG-A_S14_L001_R2_001.fastq.gz L16-24MG-A_S14_L003_R2_001.fastq.gz
L16-24MG-A_S14_L002_R2_001.fastq.gz L16-24MG-A_S14_L004_R2_001.fastq.gz
I am trying to cat the fastq files together and then run bowtie, so that at the end I have only one .bam file. This is my script so far, but it's not quite working. In fact, I am not even able to obtain a combined fastq file before bowtie. Can you help me out, please?
for i in $(ls *R1*.gz) do cat *R1* > ${i%.R1_combined.fastq}.gz done
for i in $(ls *R2*.gz) do cat *R2* > ${i%.R2_combined.fastq}.gz done
gunzip *.gz
for i in $(ls *.fastq | rev | cut -c 13- | rev | uniq)
do
bowtie /home/casaburi/ufrc/hybrid_pacbio_global/rsem/final_assembly_cdhit100 \
-1 ${i}_R1_combined.fastq -2 ${i}_R2_combined.fastq \
--all --best --strata -m 300 --chunkmbs 512 -S -p 10 | samtools view -F 4 -S -b -o ${i}.bam
done
I have formatted your code correctly. In future use the icon shown below (after highlighting the text you want to format as code) when editing.
You are not using ; to terminate your shell script statements, for one.
Thanks genomax, and sorry for the missing formatting. I am still having the issue of not being able to see a concatenated .gz file. Instead, I get this:
Which is not what I am looking for. I am looking to have only this:
Please use ADD COMMENT/ADD REPLY when responding to existing posts to keep threads logically organized.
Your original files will not disappear since you are cat-ing them to make the large file. So you should expect the combined file in addition to the originals.
You could align the four pieces in parallel and then merge the BAM files afterwards.
I know that the original files will still be there, but the combined file (which is the whole point of this post) is not appearing at this stage.
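One possible reason, assuming the loop is run as posted: the % in ${i%suffix} only removes the suffix if the value of i actually ends with it. The input names end in _R1_001.fastq.gz, not in .R1_combined.fastq, so nothing is stripped and the redirect target is just the input name with an extra .gz appended. A minimal sketch of the difference, using one of the file names from the question (the variable name sample is only for illustration):

```shell
i=L16-24MG-A_S14_L001_R1_001.fastq.gz

# The suffix pattern does not match the end of $i, so $i is left unchanged
# and ".gz" is simply appended to the full input name.
echo "${i%.R1_combined.fastq}.gz"

# Stripping the lane/read suffix that is really there leaves the sample prefix,
# which can then be used to build the combined file name.
sample=${i%_L[0-9][0-9][0-9]_R1_001.fastq.gz}
echo "${sample}_R1_combined.fastq.gz"
```

The first echo shows the kind of name the original loop writes to; the second shows a name derived from the sample prefix.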
A simple
cat L16-24MG-A_S14*R1* > L16-24MG-A_S14.combined.fastq.gz
should be sufficient for that purpose (if that is all the files you have).
Right, but I have multiple files in different folders. So I was planning to just run the same script in every folder. That's why I was looking for something that could also derive the output name from the input name (the i% part); otherwise I have to edit the script manually every time, according to the input.
No one can answer this?
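A rough sketch of one way to do this, not a tested pipeline. It assumes every sample in a folder has an L001 R1 file and that all names follow the pattern shown in the question (sample_L00N_R1_001.fastq.gz and sample_L00N_R2_001.fastq.gz). The function name combine_lanes and the _combined output names are made up for illustration. The sample prefix is taken from the L001 R1 file name, so nothing has to be edited per folder.

```shell
#!/usr/bin/env bash

# combine_lanes DIR
# For every sample found in DIR, concatenate its per-lane files into
# <sample>_R1_combined.fastq.gz and <sample>_R2_combined.fastq.gz.
combine_lanes() {
    local dir=$1 r1 sample mate
    for r1 in "$dir"/*_L001_R1_001.fastq.gz; do
        # If the glob matched nothing it stays literal; skip it.
        [ -e "$r1" ] || continue

        # Strip the lane/read suffix to get the sample prefix (with its path).
        sample=${r1%_L001_R1_001.fastq.gz}

        for mate in R1 R2; do
            # The glob expands in sorted order (L001, L002, ...), so R1 and R2
            # are concatenated lane by lane in the same order.
            cat "${sample}"_L[0-9][0-9][0-9]_"${mate}"_001.fastq.gz \
                > "${sample}_${mate}_combined.fastq.gz"
        done
    done
}
```

To use it, call combine_lanes . inside a folder, or loop over the folders from their parent directory with for d in */; do combine_lanes "${d%/}"; done. Concatenated gzip files are still valid gzip, so the inputs do not need to be decompressed before cat. The combined names do not match the lane pattern, so rerunning the function overwrites the combined files without feeding them back into cat.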