Adding read groups to large BAM files
6.2 years ago
Fungsten • 0

How do you deal with large BAM files that are missing read groups, or missing some fields in the read group lines?

There has to be a better way to deal with this situation than using Picard tools. For 100-200 GB BAM files, adding or modifying the read group takes ages.

Also, is there any specific reason why some public BAM files, such as those from GIAB, do not have all the fields in their read groups? Isn't that bad practice, given the time it later takes to pre-process the files?

bam alignment • 2.7k views

Hello,

You could try samtools addreplacerg instead, but I'm not sure whether it is faster. In the end, it is necessary to iterate over the whole file to add the read group to each read.
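
As a rough illustration of the samtools addreplacerg suggestion, the sketch below only assembles the command line and prints it, so it can be inspected before being run on a large file. The function name, the file names, and the read group values (rg1, NA12878, ILLUMINA) are hypothetical placeholders. The flags used (-r repeated once per field, -m, -o, -@) are taken from the samtools manual as I understand it; check them against your installed version.

```python
import shlex


def build_addreplacerg_cmd(in_bam, out_bam, rg_id, sample,
                           platform="ILLUMINA", library=None, threads=4):
    """Return the argument list for a `samtools addreplacerg` call.

    Hypothetical helper: it builds the command but does not run it.
    """
    # One TAG:value entry per read group field; ID is mandatory in @RG.
    fields = [f"ID:{rg_id}", f"SM:{sample}", f"PL:{platform}"]
    if library is not None:
        fields.append(f"LB:{library}")

    cmd = ["samtools", "addreplacerg",
           "-@", str(threads),        # extra compression threads (assumed supported)
           "-m", "overwrite_all"]     # tag every read, not only reads without RG
    for field in fields:
        # Repeated -r options are joined into a single @RG line by samtools.
        cmd += ["-r", field]
    cmd += ["-o", out_bam, in_bam]
    return cmd


cmd = build_addreplacerg_cmd("sample.bam", "sample.rg.bam", "rg1", "NA12878")
print(shlex.join(cmd))
```

The printed line can be pasted into a shell, or the list can be passed to subprocess.run. Extra threads only speed up (de)compression; every record is still read and rewritten once.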

"Isn't that bad practice"

Sometimes I have the feeling that "bad practice" is followed more often than "good practice" :(

fin swimmer

6.2 years ago
ATpoint 86k

I think that, since there is no formal definition of what exactly a read group is, it is not bad practice to leave it out of the BAM by default. Most tools do not even need it for downstream analysis. I personally use bamaddrg (I avoid Picard/Broad tools whenever possible because they are overly picky and their cryptic error messages are sometimes difficult to understand). If your BAMs are missing fields that a downstream tool needs, there is unfortunately no way around writing a new BAM file with the necessary fields.
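
For the narrower case where the reads already carry RG tags and only some fields are missing from the @RG header lines, one possible approach is to patch the header text and write it back with samtools reheader. The sketch below shows only the text-patching step. The function name and the default values are hypothetical, and the header string is a made-up example. It assumes the header was dumped with `samtools view -H in.bam > header.sam` and that the patched header is then applied with `samtools reheader patched.sam in.bam > out.bam`.

```python
def patch_rg_lines(header_text, defaults):
    """Add missing TAG:value fields to every @RG line of a SAM header.

    Fields already present on a line are left untouched.
    Hypothetical helper for illustration only.
    """
    patched_lines = []
    for line in header_text.splitlines():
        if line.startswith("@RG"):
            fields = line.split("\t")
            # Tags already on this line, e.g. {"ID", "SM"}.
            present = {field.split(":", 1)[0] for field in fields[1:]}
            for tag, value in defaults.items():
                if tag not in present:
                    fields.append(f"{tag}:{value}")
            line = "\t".join(fields)
        patched_lines.append(line)
    return "\n".join(patched_lines) + "\n"


# Made-up header: the read group has ID and SM but lacks PL and LB.
header = "@HD\tVN:1.6\tSO:coordinate\n@RG\tID:rg1\tSM:NA12878\n"
patched = patch_rg_lines(header, {"SM": "unknown", "PL": "ILLUMINA", "LB": "lib1"})
print(patched, end="")
```

This still produces a new BAM file, but, as far as I know, reheader replaces only the header and leaves the alignment records as they are, so it tends to be much cheaper than re-tagging every read. It does not help when the reads themselves lack RG tags.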
