13 months ago
mgranada3
I am using samtools to add the read group to my DNA-seq files. In my command I pass my input .bam and name an output file with -o.
What I am confused about is, does samtools output a completely new .bam file that I will use during subsequent steps?
#!/bin/bash
#SBATCH -J RG_test_sample1
#SBATCH -A gts-rros3
#SBATCH -N 3 --ntasks-per-node=24
#SBATCH --mem-per-cpu=8G
#SBATCH -t 24:00:00
#SBATCH -o Report-%j.out
cd $SLURM_SUBMIT_DIR
ml samtools
samtools addreplacerg -r ID:S7 -r LB:L1 -r SM:MG04_1_100_S7 -o sample1_rg.bam sample1.bam
The next step is for me to sort my .bam file. In this case, will I use the new "sample1_rg.bam" file?
Yes. samtools writes a new file, sample1_rg.bam, and leaves sample1.bam unchanged, so use sample1_rg.bam for sorting.
Using addreplacerg is quite rare. For example, if your upstream process uses bwa, you should have a look at the option -R (set the read groups).
So I realized I needed a read group after bwa-mem2 produced my .sam files. I then read that samtools could add a read group after converting them to .bam files. I tried using Picard but didn't have any luck, so I am using the samtools read group option since I have already successfully run samtools before. Will this be an issue downstream?
No, you just consume more I/O, time, and disk space. But using addreplacerg is OK.
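For anyone landing here later, this is a rough sketch of the -R route mentioned above: setting the read group at alignment time and sorting in the same pipeline, so no intermediate .sam or unsorted .bam is written. The read-group values are copied from the addreplacerg command in the question. The reference name (ref.fa) and FASTQ names are hypothetical placeholders, and the sketch assumes the reference has already been indexed with bwa-mem2 index. The alignment step is skipped if the tools or inputs are missing.

```shell
#!/bin/bash
# Sketch only: ref.fa and the FASTQ file names are hypothetical placeholders.
set -euo pipefail

# Read-group fields taken from the addreplacerg command in the question.
RG_ID="S7"
RG_LB="L1"
RG_SM="MG04_1_100_S7"

# -R expects a complete @RG header line. Inside double quotes bash keeps
# "\t" as a literal backslash-t; the aligner turns it into a real tab.
RG_LINE="@RG\tID:${RG_ID}\tLB:${RG_LB}\tSM:${RG_SM}"
printf 'Read group line: %s\n' "${RG_LINE}"

if command -v bwa-mem2 >/dev/null 2>&1 \
   && command -v samtools >/dev/null 2>&1 \
   && [ -f ref.fa ] && [ -f sample1_R1.fastq.gz ] && [ -f sample1_R2.fastq.gz ]; then
    # Align with the read group attached, then sort straight from the pipe.
    bwa-mem2 mem -R "${RG_LINE}" ref.fa sample1_R1.fastq.gz sample1_R2.fastq.gz \
        | samtools sort -o sample1.sorted.bam -
    # Confirm the @RG line made it into the header.
    samtools view -H sample1.sorted.bam | grep '^@RG'
else
    echo "bwa-mem2, samtools, or input files not found; skipping alignment" >&2
fi
```

If you stay with the addreplacerg route instead, the next step would be something like samtools sort -o sample1_rg.sorted.bam sample1_rg.bam (the sorted file name is just an example), followed by the same samtools view -H check.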