4.6 years ago
Researcher
I am doing sequence analysis on genomic data from E. coli. The reads come from Illumina sequencing. As I don't have the detailed run information, I set the arguments of the AddOrReplaceReadGroups tool in Picard as follows:
RGID=SampleName RGLB=SampleName RGPU=illumina RGPL=illumina RGSM=SampleName
Is this fine, or should I look for the exact values for these fields?
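For anyone who wants better-than-placeholder values without the run sheet: a rough sketch of how the fields could be derived from the first read name in the FASTQ file. It assumes Casava 1.8+ style Illumina headers (`@instrument:run:flowcell:lane:tile:x:y`), which may not match your data. The function names, the sample name and the `flowcell.lane` layout for RGID/RGPU are illustrative choices following a common convention, not something Picard enforces.

```python
def read_group_from_fastq_header(header, sample, library=None):
    """Build read group fields from one Illumina read name (hypothetical helper)."""
    # Assumed layout: @instrument:run:flowcell:lane:tile:x:y [optional comment]
    name = header.lstrip("@").split()[0]
    parts = name.split(":")
    if len(parts) < 7:
        raise ValueError("not a Casava 1.8+ style Illumina header: " + header)
    flowcell, lane = parts[2], parts[3]
    return {
        "RGID": f"{flowcell}.{lane}",            # one ID per flowcell lane
        "RGPU": f"{flowcell}.{lane}.{sample}",   # platform unit; sample stands in for the barcode
        "RGSM": sample,                          # sample name
        "RGLB": library or sample,               # library; falls back to the sample name
        "RGPL": "ILLUMINA",                      # sequencing platform
    }


def picard_args(fields):
    """Format the fields as KEY=VALUE arguments for AddOrReplaceReadGroups."""
    return " ".join(f"{key}={value}" for key, value in sorted(fields.items()))


if __name__ == "__main__":
    # Made-up header and sample name, for illustration only.
    header = "@M01234:55:000000000-A1B2C:1:1101:15589:1332 1:N:0:1"
    print(picard_args(read_group_from_fastq_header(header, sample="ecoli_K12")))
```

The printed string can be appended to the AddOrReplaceReadGroups call alongside the usual input and output arguments. If the library or barcode is known, pass it in rather than relying on the sample-name fallback.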
This was the first sentence of the tutorial that @Pierre linked in your last question on this topic. Are you trying to use GATK for bacteria?
Dear @genomax
Yes, I am using GATK, but I skipped some steps, such as base quality score recalibration (BQSR), because the data are bacterial. So my question now is: for which GATK steps is the read group information needed?
Any answer from @genomax , @ATpoint or anyone else?
I am not a regular GATK user but all GATK/Picard tools likely require RG.
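If you want to see whether a BAM already carries usable read groups before running anything, here is a minimal sketch that inspects the header text (for example the output of `samtools view -H file.bam`). The function name and the choice of `ID` and `SM` as the required tags are my assumptions; individual tools may ask for more, such as `PL` or `LB`.

```python
def check_read_groups(header_text, required=("ID", "SM")):
    """Return a list of problems found in the @RG lines of a SAM header (hypothetical helper)."""
    rg_lines = [line for line in header_text.splitlines() if line.startswith("@RG")]
    if not rg_lines:
        return ["no @RG line found"]
    problems = []
    for line in rg_lines:
        # SAM header fields are tab-separated TAG:VALUE pairs after the record type.
        tags = dict(field.split(":", 1) for field in line.split("\t")[1:] if ":" in field)
        for tag in required:
            if tag not in tags:
                problems.append(f"@RG {tags.get('ID', '?')} is missing {tag}")
    return problems


if __name__ == "__main__":
    # Made-up header, for illustration only.
    header = "@HD\tVN:1.6\tSO:coordinate\n@RG\tID:rg1\tPL:ILLUMINA\n"
    for problem in check_read_groups(header):
        print(problem)
```

An empty list means every read group has the tags you asked for. It does not guarantee that a particular GATK step will accept the file.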
That said, snippy was recommended today as a simpler tool for bacterial SNP calling.
Do you plan to use any tools that rely on exact read group information, such as those in the Broad Institute's Picard/GATK family? If not, just leave it as is. Personally, I never care about the inconsistently defined concept of RGs.
Dear @ATpoint