Hi,
Is there a way to retrieve the part of reads that has been hard clipped and discarded in bam files?
Thanks!
Hard clipping removes the read base and quality information from that record in the BAM file. It must be recovered from another source (e.g. the original FASTQ file) as the information is not present in the BAM file.
Exception: if the BAM file contains multiple records for a single read, the hard-clipped record can be rehydrated from another record. For example, if your hard clips are due to split read alignments generated by bwa, the primary alignment record will be soft clipped and will contain the required information. The hard clips on the supplementary alignments can then be converted to soft clips using the information from the primary alignment.
My SV caller GRIDSS includes a (somewhat poorly named) utility program gridss.ComputeSamTags that, using the SOFTEN_HARD_CLIPS parameter, can do exactly that (you'll need to sort by read name first).
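To make the idea concrete, here is a rough sketch in plain Python of how a hard-clipped supplementary record could be rebuilt from its primary record. This is not how GRIDSS implements it; the function name soften_hard_clips and its arguments are made up for illustration. It assumes the primary record carries the full read sequence (soft clips only, no hard clips) and that you have already paired up the records for one read, for example from a name-sorted BAM.

```python
import re

CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def soften_hard_clips(primary_seq, primary_reverse,
                      supp_cigar, supp_seq, supp_reverse):
    """Hypothetical helper: return (cigar, seq) for a supplementary record
    with its hard clips (H) turned into soft clips (S).

    Assumes primary_seq is the complete read as stored in the primary
    record, i.e. the primary alignment has no hard clips of its own.
    """
    ops = [(int(n), op) for n, op in CIGAR_RE.findall(supp_cigar)]

    # SEQ is stored relative to the reference strand, so flip the read
    # if the two records align to opposite strands.
    if primary_reverse == supp_reverse:
        full = primary_seq
    else:
        full = reverse_complement(primary_seq)

    # Sanity checks: the CIGAR must describe the whole read, and the bases
    # the supplementary record already has must match the primary's copy.
    read_len = sum(n for n, op in ops if op in "MIS=XH")
    if read_len != len(full):
        raise ValueError("CIGAR length does not match primary sequence")
    lead = ops[0][0] if ops and ops[0][1] == "H" else 0
    if full[lead:lead + len(supp_seq)] != supp_seq:
        raise ValueError("supplementary bases disagree with primary")

    # Replace H with S, merging with an adjacent S if one is present.
    merged = []
    for n, op in ops:
        op = "S" if op == "H" else op
        if merged and merged[-1][1] == op:
            merged[-1] = (merged[-1][0] + n, op)
        else:
            merged.append((n, op))
    return "".join(f"{n}{op}" for n, op in merged), full


# Toy 10 bp read: primary is 6M4S, supplementary is 6H4M on the same strand.
cigar, seq = soften_hard_clips("ACGTACGTAC", False, "6H4M", "GTAC", False)
print(cigar, seq)
```

Base qualities would be handled the same way, except that they are only reversed (not complemented) when the strands differ. None of this helps if every record for a read is hard clipped; in that case the bases really are gone from the BAM.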
If you have the original FASTQ files, you could fish the reads out of there. I assume you probably don't have them, hence the question?
Exactly. I don't have the FASTQ files; I only have the BAM file.
Then you have answered your own question. If they are not in the BAM then there is no way to get them back.