Hello all,
I have aligned my samples against the mitochondrion genome of the species I work with. My idea was that after this I would keep the unmapped ones (which would be the nuclear reads), and then align these against the reference genome.
However, and I did not think of this, the result of said process are bam files, and I have no idea how or if I can use bam files for a re-alignment.
Here is the code I have used for the mapping against the mtDNA:
#!/bin/bash
REFERENCE=../refs/mtDNA/mitochondrion-genome.fa
REFNAME=mitochondrion-genome
FORWARD=_1.fq.gz
REVERSE=_2.fq.gz
for i in ../trim/1stBatch-fastp/*_1.fq.gz
do
SAMPLE=$(echo ${i} | sed "s/_1\.fq\.gz//" | rev | cut -d'/' -f-1 | rev)
echo ${SAMPLE}_1.fq.gz ${SAMPLE}_2.fq.gz
## Define reference base name
REFBASENAME="${REFERENCE%.*}"
sbatch --account ACCOUNT --mem=64g -c 8 -o bowtie2-mtDNA.out -e bowtie2-mtDNA.err \
--wrap="bowtie2 -q --phred33 --very-sensitive -p 8 -I 0 -X 1500 -x ${REFBASENAME} \
-1 ../trim/1stBatch-fastp/${SAMPLE}${FORWARD} -2 ../trim/1stBatch-fastp/${SAMPLE}${REVERSE} -S ../bam/1stBatch/mtDNA/${SAMPLE}'_bt2_'${REFNAME}'.sam'
samtools view -bS ../bam/1stBatch/mtDNA/${SAMPLE}'_bt2_'${REFNAME}'.sam' > ../bam/1stBatch/mtDNA/${SAMPLE}'_bt2_'${REFNAME}'.bam'
rm -f ../bam/1stBatch/mtDNA/${SAMPLE}'_bt2_'${REFNAME}'.sam'
samtools view -h -q 20 ../bam/1stBatch/mtDNA/${SAMPLE}'_bt2_'${REFNAME}'.bam' | samtools view -buS - | samtools sort -o ../bam/1stBatch/mtDNA/${SAMPLE}'_bt2_'${REFNAME}'_minq20_sorted.bam'"
done
I was thinking that, as alternative, I could append the mtDNA to the nuclear genome and then use samtools to remove the reads aligned to the mtDNA. What do you think is a best strategy?
Thank you!
Great answer, thank you so much.