I need help getting read group information for an alignment with BWA-MEM2. I read a previous post (bwa mem: Passing a variable to read group) in which a shell script extracts the read group information from the FASTQ file. Could someone explain which details need to be supplied in the shell script? It would be a great help.
Since you want to run bwa-mem2, you will need to edit the script to replace the bwa mem command. Otherwise you can use the answer in bwa mem: Passing a variable to read group and run the script as shown there: bwa-mapper.sh read_1.fq.gz read_2.fq.gz. Your read headers will need to follow the standard Illumina format.
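A=( $(ls "$1"/*1.fastq.gz "$1"/*1.fq.gz 2>/dev/null) ) # collect all forward FASTQ files in directory $1
for i in "${!A[@]}";
do
header=$(zcat "${A[i]}" | head -n 1) # first read header of this file
id=$(echo "$header" | cut -f 1 -d":" | sed 's/@//') # first header field, without the leading @
echo "@RG\tID:$id" # one read group line per forward file
done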
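The loop above visits each forward file in turn, so each file gets its own read group rather than all files sharing one. As a rough sketch of how a fuller per-subject read group could be built, the function below assumes the standard Illumina header layout (`@instrument:run:flowcell:lane:tile:x:y ...`). The function name `rg_from_header`, the sample label, and reusing that label for `LB` are illustrative assumptions, not part of the linked answer.

```shell
#!/usr/bin/env bash
# Sketch: build an @RG string from one FASTQ header line and a sample name.
# Assumes an Illumina header: @instrument:run:flowcell:lane:tile:x:y read:filter:control:index
rg_from_header() {
    local header="$1" sample="$2"
    local flowcell lane
    flowcell=$(printf '%s\n' "$header" | cut -d: -f3)   # third colon-separated field
    lane=$(printf '%s\n' "$header" | cut -d: -f4)       # fourth colon-separated field
    # Emit a literal backslash-t; bwa converts "\t" in the -R string to real tabs.
    # LB is set to the sample name here only as a placeholder (assumption).
    printf '@RG\\tID:%s.%s\\tSM:%s\\tPL:ILLUMINA\\tLB:%s\\tPU:%s.%s.%s\n' \
        "$flowcell" "$lane" "$sample" "$sample" "$flowcell" "$lane" "$sample"
}

# Hypothetical usage for one subject (file names and ref.fa are placeholders):
#   header=$(zcat subject1_1.fq.gz | head -n 1)
#   rg=$(rg_from_header "$header" subject1)
#   bwa-mem2 mem -R "$rg" ref.fa subject1_1.fq.gz subject1_2.fq.gz > subject1.sam
```

Calling this once per subject, inside a loop like the one above, would give each of the 6 subjects a distinct `SM` value, which is what downstream GATK tools use to tell samples apart.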
I have paired-end sequences for 6 subjects. Should read group information be added to the BAM file for each subject? The read group info differs from subject to subject, right? If so, why does the code above collect all the forward FASTQ files together? I am trying to understand the GATK pipeline for NGS analysis. Sorry for the silly question.
Making a vague reference to a previous post does not help you or us. Please provide a link for that post.
Sorry, I have given the link above.