My FASTQ files for one sample, each followed by its first header line, look like this:
HHNG7DSX5_19417170_S118_L003_R1_001.fastq.gz
@A00428:335:HHNG7DSX5:3:1101:5466:1000 1:N:0:NGATGTTT+NTCAATTG
HHNG7DSX5_19417170_S118_L003_R2_001.fastq.gz
@A00428:335:HHNG7DSX5:3:1101:5466:1000 2:N:0:NGATGTTT+NTCAATTG
HHNG7DSX5_19417170_S118_L004_R1_001.fastq.gz
@A00428:335:HHNG7DSX5:4:1101:2302:1000 1:N:0:NGATGTTT+NTCAATTG
HHNG7DSX5_19417170_S118_L004_R2_001.fastq.gz
@A00428:335:HHNG7DSX5:4:1101:2302:1000 2:N:0:NGATGTTT+NTCAATTG
I merged L003_R1 with L004_R1, and L003_R2 with L004_R2. My first question: should I merge the lanes for R1 and for R2 like this?
I want to run bwa mem as part of the GATK variant calling pipeline:
bwa mem -t 4 -B 4 -O 6 -E 1 -M -p -R "@RG\tID:HHNG7DSX5.3\tSM:${samplename}\tLB:${samplename}\tPL:ILLUMINA" /hg38_ref/genome.fa ${filepath}/$r1 ${filepath}/$r2 > ${filepath}/${samplename}.paired.sam
Is my RG correct, or do I need to change something?
Any pointers or a tutorial on how to fill in the RG fields would be really helpful.
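To make the question concrete, here is a minimal sketch of deriving the RG string from a FASTQ header line. It assumes the common convention of ID = flowcell.lane and PU = flowcell.lane (GATK's read group documentation additionally appends the sample barcode to PU). `rg_from_header` is a hypothetical helper name, not part of bwa or GATK, and LB is simply set to the sample name as in the command above.

```shell
#!/usr/bin/env bash
# Sketch only: build a bwa mem -R string from an Illumina header line.
# rg_from_header is a hypothetical helper; the ID/PU layout is an assumed convention.
rg_from_header() {
    local header="$1" sample="$2"
    local name_part flowcell lane
    # Keep the part before the first space,
    # e.g. "@A00428:335:HHNG7DSX5:3:1101:5466:1000"
    name_part="${header%% *}"
    # Colon-separated fields: 3 = flowcell ID, 4 = lane number
    flowcell=$(printf '%s' "$name_part" | cut -d: -f3)
    lane=$(printf '%s' "$name_part" | cut -d: -f4)
    # Emit a literal backslash-t between fields; bwa mem converts it to a tab
    printf '@RG\\tID:%s.%s\\tPU:%s.%s\\tSM:%s\\tLB:%s\\tPL:ILLUMINA\n' \
        "$flowcell" "$lane" "$flowcell" "$lane" "$sample" "$sample"
}

# Example use with one lane's file (file name taken from the listing above):
#   header=$(gzip -cd HHNG7DSX5_19417170_S118_L003_R1_001.fastq.gz | head -n 1)
#   rg=$(rg_from_header "$header" "$samplename")
#   bwa mem -R "$rg" ...
```

With the headers above, the lane 3 files would give ID:HHNG7DSX5.3 and the lane 4 files ID:HHNG7DSX5.4, so under this convention files merged across lanes no longer correspond to a single ID.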
Thanks,
Payal