How do you deal with large BAM files that are missing read groups or some fields in the read group lines?
There has to be a better way to deal with these situations than using Picard tools. For BAM files of 100-200 GB it takes ages to add or modify the read group.
Also, is there any specific reason why some public BAM files, like the ones from GiaB, do not have all the fields in the read groups? Isn't that bad practice, given the time it takes later on to pre-process the files?
Hello,
you could try
samtools addreplacerg
instead, but I'm not sure whether it is faster. In the end it is necessary to iterate over the whole file to add the read group to each read.
Sometimes I have the feeling that "bad practice" is followed more often than "good practice" :(
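For reference, a minimal sketch of what the addreplacerg call could look like. The file names and the read group values (ID, SM, PL, LB) are made-up placeholders, not taken from any real dataset, and the thread count is just a guess at something reasonable. The repeated -r form follows the example in the samtools manual, which joins the pieces with tabs.

```shell
# Placeholder file names and read group values (assumptions); replace with your own.
in_bam="input.bam"
out_bam="output.rg.bam"
rg_id="sample1"
rg_sm="sample1"
rg_pl="ILLUMINA"
rg_lb="lib1"

# Attach one read group to every read in the file.
# -m overwrite_all : tag all reads, replacing any RG tag they already carry
# -@ 4             : 4 additional compression threads (adjust to your machine)
add_rg() {
    samtools addreplacerg -@ 4 -m overwrite_all \
        -r "ID:${rg_id}" -r "SM:${rg_sm}" -r "PL:${rg_pl}" -r "LB:${rg_lb}" \
        -o "$out_bam" "$in_bam"
}

# Only run when samtools and the input file are actually available.
if command -v samtools >/dev/null 2>&1 && [ -f "$in_bam" ]; then
    add_rg
fi
```

The extra threads only help with compression; the file is still read from start to end. If the reads already carry RG tags and only fields in the @RG header line are missing, samtools reheader may be the quicker route, since it replaces the header without touching the individual records.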
fin swimmer