READ GROUP in GATK
15 months ago
Payal ▴ 160

The FASTQ files for one sample, each shown with its first header line, look like this:

HHNG7DSX5_19417170_S118_L003_R1_001.fastq.gz
@A00428:335:HHNG7DSX5:3:1101:5466:1000 1:N:0:NGATGTTT+NTCAATTG

HHNG7DSX5_19417170_S118_L003_R2_001.fastq.gz
@A00428:335:HHNG7DSX5:3:1101:5466:1000 2:N:0:NGATGTTT+NTCAATTG

HHNG7DSX5_19417170_S118_L004_R1_001.fastq.gz
@A00428:335:HHNG7DSX5:4:1101:2302:1000 1:N:0:NGATGTTT+NTCAATTG

HHNG7DSX5_19417170_S118_L004_R2_001.fastq.gz
@A00428:335:HHNG7DSX5:4:1101:2302:1000 2:N:0:NGATGTTT+NTCAATTG

I merged the lanes for each read direction: L003_R1 with L004_R1, and L003_R2 with L004_R2. My first question: should I also merge R1 and R2?

I want to run bwa mem as the alignment step of a GATK variant-calling pipeline:

bwa mem -t 4 -B 4 -O 6 -E 1 -M -p -R "@RG\tID:HHNG7DSX5.3\tSM:${samplename}\tLB:${samplename}\tPL:ILLUMINA" /hg38_ref/genome.fa ${filepath}/$r1 ${filepath}/$r2 > ${filepath}/${samplename}.paired.sam

Is my RG correct? Or do I need to change something?

Any pointers or a tutorial on how to fill in the RG fields would be really helpful.

Thanks,
Payal

15 months ago
Ram 44k

should I merge R1 and R2 lanes?

No. R1 and R2 are the two mates of each read pair, not lanes, and bwa mem expects them as two separate files. You have already concatenated the lanes (the L00* part of the filename denotes the lane).
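
For reference, lane concatenation of this kind can be sketched as below. It assumes the per-lane files are gzip-compressed (concatenated gzip files remain a valid gzip stream) and that R1 and R2 are merged in the same lane order so the mates stay in sync. merge_lanes is a hypothetical helper name and the output filenames are placeholders.

```shell
# Hypothetical helper: concatenate per-lane FASTQ files for ONE read direction.
# Usage: merge_lanes OUTPUT LANE_FILE [LANE_FILE ...]
merge_lanes() {
    out=$1
    shift
    # Files are appended in the order given; use the same lane order
    # for R1 and R2 so that read i in R1 still pairs with read i in R2.
    cat "$@" > "$out"
}

# Example calls (placeholder output names):
# merge_lanes merged_R1.fastq.gz \
#     HHNG7DSX5_19417170_S118_L003_R1_001.fastq.gz \
#     HHNG7DSX5_19417170_S118_L004_R1_001.fastq.gz
# merge_lanes merged_R2.fastq.gz \
#     HHNG7DSX5_19417170_S118_L003_R2_001.fastq.gz \
#     HHNG7DSX5_19417170_S118_L004_R2_001.fastq.gz
```

R1 and R2 are never concatenated with each other; they stay as two files.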

Is my RG correct? Or do I need to change something?

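The exact RG values matter less than keeping them consistent: ID must be unique per read source (typically flowcell plus lane), and SM must be the sample name, because GATK identifies samples by SM. Since your merged files contain lanes 3 and 4, the ID HHNG7DSX5.3 now labels reads from both lanes. That will run, but aligning each lane with its own ID and merging the BAMs afterwards keeps the lane-level information. One thing to check in the command: with -p, bwa mem treats the first file as interleaved and ignores the second, so drop -p when passing separate R1 and R2 files. Otherwise the command looks fine; try running it and see what happens downstream. You can always fix the read groups later with Picard AddOrReplaceReadGroups if necessary.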
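
As a rough sketch of how the RG fields could be filled in from the data itself: in the Illumina headers shown in the question, the flowcell and lane are the third and fourth colon-separated fields, so an ID of the form flowcell.lane can be derived rather than typed by hand. make_rg is a hypothetical helper, not part of bwa or GATK, and the sample name passed to it is a placeholder. It only fills the fields already used in the question (ID, SM, LB, PL).

```shell
# Hypothetical helper: build an @RG string from the first FASTQ header line.
# Assumed header layout: @instrument:run:flowcell:lane:tile:x:y ...
# Usage: make_rg HEADER SAMPLE_NAME
make_rg() {
    header=$1   # e.g. obtained with: zcat file.fastq.gz | head -n 1
    sample=$2   # placed in both SM and LB, as in the question's command
    flowcell=$(printf '%s\n' "$header" | cut -d: -f3)
    lane=$(printf '%s\n' "$header" | cut -d: -f4)
    # Emit literal backslash-t sequences; bwa mem -R converts them to tabs.
    printf '@RG\\tID:%s.%s\\tSM:%s\\tLB:%s\\tPL:ILLUMINA' \
        "$flowcell" "$lane" "$sample" "$sample"
}

# Lane 4 header from the question, with a placeholder sample name.
rg=$(make_rg '@A00428:335:HHNG7DSX5:4:1101:2302:1000 1:N:0:NGATGTTT+NTCAATTG' 'sample1')
printf '%s\n' "$rg"
# prints: @RG\tID:HHNG7DSX5.4\tSM:sample1\tLB:sample1\tPL:ILLUMINA
```

The resulting string can be passed to bwa mem as -R "$rg". Running it once per lane, before any merging, gives each lane its own read group.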
