3 months ago
hpapoli
Hello,
There are many posts about building read group information, but I haven't found a complete one yet. For a sample called Sample_A, with the following FASTQ header from sample_A.R1.fastq.gz:
@A00181:639:HNTFMDSX5:2:1101:1018:1000 1:N:0:ACACTAAG+TTATGGAT
Is this ReadGroup correct?
@RG\\tID:HNTFMDSX5.2\\tPL:ILLUMINA\\tLB:Sample_A\\tSM:Sample_A\\tPU:HNTFMDSX5.2.SampleA
HNTFMDSX5: Flowcell ID
2: Flowcell lane
I am mostly uncertain about the value for LB and the value for the third field of PU (PU = {FLOWCELL_BARCODE}.{LANE}.{SAMPLE_BARCODE}).
Thanks so much for your help!
Thanks very much! What about the ID and PU fields? Is the PU field correct as it is written now? Thanks again!
https://gatk.broadinstitute.org/hc/en-us/articles/360035890671-Read-groups should explain how to use read groups. This is a GATK requirement. Probably not needed for most other software.
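For illustration, here is a minimal Python sketch of how the read group fields could be derived from the FASTQ header shown in the question. It is not an official GATK recipe. The function names `parse_illumina_header` and `build_read_group` are hypothetical, using the index sequence from the header as the {SAMPLE_BARCODE} part of PU is an assumption, and the LB value has to be supplied by you because the library name cannot be read from the header.

```python
def parse_illumina_header(header):
    """Split a colon-delimited Illumina FASTQ header into named fields.

    Assumes the layout seen in the question:
    @instrument:run:flowcell:lane:tile:x:y read:filtered:control:index
    """
    name, _, comment = header.lstrip("@").partition(" ")
    instrument, run, flowcell, lane, tile, x, y = name.split(":")
    read, filtered, control, index = comment.split(":")
    return {
        "instrument": instrument,
        "run": run,
        "flowcell": flowcell,
        "lane": lane,
        "index": index,
    }


def build_read_group(header, sample, library, platform="ILLUMINA"):
    """Build an @RG string from a FASTQ header (hypothetical helper).

    ID is {flowcell}.{lane}; PU is {flowcell}.{lane}.{index}.
    Using the index sequence as the sample barcode is an assumption.
    """
    f = parse_illumina_header(header)
    rg_id = f"{f['flowcell']}.{f['lane']}"
    pu = f"{rg_id}.{f['index']}"
    fields = [
        "@RG",
        f"ID:{rg_id}",
        f"PL:{platform}",
        f"LB:{library}",
        f"SM:{sample}",
        f"PU:{pu}",
    ]
    # Join with a literal backslash followed by "t" (not a real tab),
    # matching the escaped form used in the question.
    return "\\t".join(fields)


header = "@A00181:639:HNTFMDSX5:2:1101:1018:1000 1:N:0:ACACTAAG+TTATGGAT"
print(build_read_group(header, sample="Sample_A", library="Sample_A"))
```

Whether LB should equal the sample name depends on how the libraries were prepared; if one sample was prepared as several libraries, each would presumably need its own LB value.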