Question

Input arguments for AddOrReplaceReadGroups (Picard) from fastq

1

3.5 years ago

shpak.max ▴ 50

I'm trying to use the Picard tool AddOrReplaceReadGroups to format the headers of BAM files for GATK.

I'm unclear on what to use for the RGLB and RGPU input arguments (read group library and platform unit). I know the latter is the sequence barcode. My question is: how do I determine RGLB and RGPU from the FASTQ file?

The FASTQ files I obtained from NCBI SRA have the following header lines, from which I can't seem to determine the read group library or barcode, e.g.

@SRR8439151.1.1 1 length=150
NGCTGAGGTAATAATTACACACAACACATCGGCAGTATGCTCAAAAGCTGTTTAGGCAAAATTATACGAATTTGCATATT
CAATTGAACCGAACACATAGGCTCGGCAATGAATAACGCATGGATGAGCTTATTTCTGCAATTAAAAGTT
+SRR8439151.1.1 1 length=150
#AAAFFJJJJJJJJJJJJJJJJJJJJJJJJJJJFFFFJJJFJJF-<F--<F-FFJFJFFJJA7F-JJJJJAAJJJFA-AJ
JJAJFJFJJJFFJJJJFFA7JJ7-7AJ<<AJAFJJJFFFJ-AJ<AFFJJAF-77A<<FFFA-A<-A7<--
@SRR8439151.1.2 2 length=150

Picard GATK • 4.4k views

score 3 · Answer 1 · 2021-07-08

3

3.5 years ago

GenoMax 148k

You will find this page useful for the question at hand. You may end up having to make strings up for some of these if the relevant information is not available.
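As a rough illustration of "making strings up", the sketch below derives placeholder read-group values from the SRA run accession in a FASTQ header like the one in the question. The helper names, the `_lib` and `_unit` suffixes, the sample name, and the `ILLUMINA` platform default are all assumptions for illustration; only the `RGID`, `RGLB`, `RGPL`, `RGPU` and `RGSM` argument names come from Picard's AddOrReplaceReadGroups.

```python
import re


def run_accession(fastq_header: str) -> str:
    """Extract the SRA run accession (e.g. SRR8439151) from a FASTQ header line."""
    match = re.match(r"@([SED]RR\d+)", fastq_header)
    if match is None:
        raise ValueError(f"no SRA run accession in header: {fastq_header!r}")
    return match.group(1)


def read_group_args(fastq_header: str, sample: str, platform: str = "ILLUMINA") -> list[str]:
    """Build placeholder read-group arguments keyed on the run accession.

    The library and platform-unit values are invented placeholders, not
    real metadata; the platform default is an assumption.
    """
    run = run_accession(fastq_header)
    return [
        f"RGID={run}",
        f"RGLB={run}_lib",    # placeholder library name
        f"RGPL={platform}",   # assumed sequencing platform
        f"RGPU={run}_unit",   # placeholder platform unit
        f"RGSM={sample}",
    ]


if __name__ == "__main__":
    header = "@SRR8439151.1.1 1 length=150"
    command = [
        "java", "-jar", "picard.jar", "AddOrReplaceReadGroups",
        "I=input.bam", "O=output.bam",
    ] + read_group_args(header, sample="sample1")
    print(" ".join(command))
```

Keying every placeholder on the run accession keeps the values distinct between runs, but it treats each run as its own library, which may not match how the samples were actually prepared.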

0

Will GATK (specifically IndelRealigner) run successfully if I just provide "placeholder" strings for RGLB and RGPU?

Reply • 3.5 years ago by shpak.max ▴ 50

2

Yes, the strings themselves don't matter as long as they are correctly differentiated among the data, i.e., don't use the same RGLB placeholder for data that came from different libraries, etc.

Note however that local realignment around indels is no longer necessary if you're going to use HaplotypeCaller or Mutect2 to do your variant calling.

You can get more info and answers about GATK-specific questions from the GATK team themselves on their support forum: https://gatk.broadinstitute.org/hc/en-us/community/topics
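The point above about keeping placeholders differentiated can be checked before running Picard. This is a minimal sketch, assuming you can record for each BAM file which library it came from (for example from the SRA metadata); the function name and the input layout are hypothetical.

```python
from collections import defaultdict


def shared_rglb(assignments: dict[str, tuple[str, str]]) -> dict[str, list[str]]:
    """Report RGLB placeholders that are reused across different libraries.

    assignments maps a BAM file name to (library of origin, RGLB placeholder).
    Returns a dict from each offending RGLB value to the sorted list of
    distinct libraries that share it; an empty dict means no clashes.
    """
    origins_by_rglb = defaultdict(set)
    for origin, rglb in assignments.values():
        origins_by_rglb[rglb].add(origin)
    return {
        rglb: sorted(origins)
        for rglb, origins in origins_by_rglb.items()
        if len(origins) > 1
    }


if __name__ == "__main__":
    # Hypothetical example: a.bam and b.bam come from different libraries
    # but were given the same placeholder, which should be flagged.
    clashes = shared_rglb({
        "a.bam": ("libA", "lib1"),
        "b.bam": ("libB", "lib1"),
        "c.bam": ("libB", "lib2"),
    })
    print(clashes)
```

The same grouping idea could be applied to RGPU or RGSM if those are also filled with invented strings.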

Reply • 3.5 years ago by vdauwera ★ 1.2k