I created a .bam file without read groups in the header, and am trying to correct this by using picard tools' AddOrReplaceReadGroups.
I've looked over the documentation for AddOrReplaceReadGroups,
http://broadinstitute.github.io/picard/command-line-overview.html#AddOrReplaceReadGroups
but I'm not sure where to obtain several of the input arguments needed to run it. Specifically, given a .bam file (and the source fq files), how do I find
RGLB (read group library), RGID (read group ID), RGPU (run barcode), etc.? Of the arguments, the only one that's straightforward to me is RGPL (platform, i.e. illumina, solid, etc.).
Thanks - where does one obtain the run barcode? Is it something that I can get from the .bam file header?
If a barcode was used, it'll often be in the fastq files, after the read names (e.g., @name 1:N:0:barcode in recent Illumina output).
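To make that concrete, here is a rough Python sketch, assuming the reads use the Illumina CASAVA 1.8+ header layout (@instrument:run:flowcell:lane:tile:x:y read:filtered:control:barcode). The function names, the example header, and the sample/library names are made up for illustration. Using flowcell.lane as RGID and flowcell.lane.barcode as RGPU is a common convention, not something Picard requires. RGSM and RGLB are not stored in the fastq at all; they normally come from your own sample sheet or lab records.

```python
def parse_illumina_header(header):
    """Split a CASAVA 1.8+ style FASTQ header into the fields we need.

    Assumes '@name comment' with 7 colon-separated name fields and
    4 colon-separated comment fields; raises ValueError otherwise.
    """
    name, _, comment = header.strip().lstrip("@").partition(" ")
    name_fields = name.split(":")
    comment_fields = comment.split(":")
    if len(name_fields) != 7 or len(comment_fields) != 4:
        raise ValueError("not a CASAVA 1.8+ style header: " + header)
    return {
        "flowcell": name_fields[2],
        "lane": name_fields[3],
        "barcode": comment_fields[3],
    }


def suggest_read_group(header, sample, library, platform="illumina"):
    """Build read group values from one header plus user-supplied metadata."""
    f = parse_illumina_header(header)
    return {
        "RGID": f"{f['flowcell']}.{f['lane']}",
        "RGPU": f"{f['flowcell']}.{f['lane']}.{f['barcode']}",
        "RGSM": sample,    # supplied by you, not read from the file
        "RGLB": library,   # supplied by you, not read from the file
        "RGPL": platform,
    }


# Hypothetical header and metadata, for illustration only
example = "@HWI-ST1234:100:C0ABCACXX:3:1101:1234:2045 1:N:0:ATCACG"
rg = suggest_read_group(example, sample="sample1", library="lib1")
for key, value in rg.items():
    print(f"{key}={value}")
```

The printed KEY=value pairs have the same shape as the arguments AddOrReplaceReadGroups expects, so they can be copied onto the command line alongside the input and output files.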
I had looked at that, but the barcodes for each read aren't unique. I assumed that somewhere there is a unique barcode for the sample as well. In any case, I tried running AddOrReplaceReadGroups with RGPU=NONE. It produced an output .bam file, but the header seems to be missing some identifiers.
Hey, about the differing barcodes: there should be one barcode that outnumbers the rest by many orders of magnitude; that is the canonical one. The rest should differ from it by only one base (sequencing errors in the index read).
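If you want to check this on your own data, a minimal sketch in Python could look like the following. It assumes the barcode is the last colon-separated field of each fastq header line and that every record is exactly four lines; the records here are made up so the example runs on its own, and the helper names are just for illustration.

```python
from collections import Counter


def count_barcodes(fastq_lines):
    """Tally the last colon-separated field of every FASTQ header line."""
    counts = Counter()
    for i, line in enumerate(fastq_lines):
        if i % 4 == 0:  # each record is 4 lines; the header is the first
            counts[line.strip().rsplit(":", 1)[-1]] += 1
    return counts


def mismatches(a, b):
    """Number of differing positions, or None if the lengths differ."""
    if len(a) != len(b):
        return None
    return sum(x != y for x, y in zip(a, b))


# Made-up reads: five with the expected barcode, two with a single error
barcodes = ["ATCACG"] * 5 + ["ATCACT", "ATNACG"]
fastq_lines = []
for n, bc in enumerate(barcodes):
    fastq_lines += [
        f"@HWI-ST1234:100:C0ABCACXX:3:1101:{n}:2045 1:N:0:{bc}",
        "ACGT",
        "+",
        "IIII",
    ]

counts = count_barcodes(fastq_lines)
canonical, n_reads = counts.most_common(1)[0]
# Distance of every other observed barcode from the most frequent one
variants = {bc: mismatches(canonical, bc) for bc in counts if bc != canonical}

print("canonical barcode:", canonical, "reads:", n_reads)
print("other barcodes and their mismatches:", variants)
```

For a real file, pass an open file handle instead of the list (for example `open("reads.fq")`, or `gzip.open("reads.fq.gz", "rt")` for compressed input). The most frequent barcode is then a reasonable candidate for the barcode part of RGPU.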