Hi all,
After stumbling on this post: https://bioinformatics.stackexchange.com/questions/16112/precisely-clipping-bam-file-to-bed-coordinates, I found out about the samtools ampliconclip subcommand and wanted to use it for my own project. However, I'm running into an issue where its --soft-clip and --hard-clip options behave differently.
Essentially, what I'm trying to do is clip certain reads of interest in a bam file down to a specific window that I specify using a bed file. However, when I run ampliconclip with --soft-clip, it gives me what I want, but when I run it with --hard-clip, it basically produces a blank bam file as output.
This is a picture of the unprocessed bam file of interest
This is the complement bed file that I want to clip
9 0 33261120
9 33262737 149000000000
After running the code
samtools ampliconclip --both-ends --soft-clip -b input.bed input.bam > output.bam
I get the result I want: all my reads are clipped everywhere except within a certain genomic interval.
What I really want, though, is the hard-clipped version of these reads. But when I run the same command with --hard-clip
samtools ampliconclip --both-ends --hard-clip -b input.bed input.bam > output.bam
I basically get an empty file with no reads at all.
From my understanding, using ampliconclip with --hard-clip versus --soft-clip should only change the CIGAR string and the stored sequence, not how the intersection with the bed file works, so I'm a little stumped as to why it's giving me such different outputs. Any help is greatly appreciated!
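To make that expectation concrete, here is a minimal sketch of what I'd expect soft and hard clipping to do to a single read. It is a toy illustration and not how samtools implements it: the function name clip_read_start is made up, and it assumes an ungapped read whose CIGAR is just "<length>M". Per the SAM format, soft-clipped bases stay in SEQ while hard-clipped bases are removed, and POS moves to the first aligned base that remains in both cases.

```python
def clip_read_start(seq, pos, n, hard=False):
    """Clip the first n bases of an ungapped read (CIGAR '<len>M').

    Hypothetical helper for illustration only.
    Returns (pos, cigar, seq) as they would appear in the SAM record.
    """
    if not 0 < n < len(seq):
        raise ValueError("n must be between 1 and len(seq) - 1")
    op = "H" if hard else "S"          # H = hard clip, S = soft clip
    cigar = f"{n}{op}{len(seq) - n}M"  # clipped bases, then remaining matches
    new_seq = seq[n:] if hard else seq # hard clipping drops the bases from SEQ
    return pos + n, cigar, new_seq     # POS is the leftmost aligned base


soft = clip_read_start("ACGTACGTAC", 100, 4)
hard = clip_read_start("ACGTACGTAC", 100, 4, hard=True)
print(soft)  # (104, '4S6M', 'ACGTACGTAC')
print(hard)  # (104, '4H6M', 'ACGTAC')
```

Under these assumptions, the only differences between the two modes are the clip operator and the length of SEQ, which is why I'd expect the same set of reads in both outputs.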
Did you check the actual bam file? It might be masked in the graphical representation
Hi, yes, I checked the bam file. It's significantly smaller than the soft-clipped one and only contains a couple of reads.
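For anyone who wants to make the same check without relying on a viewer, a rough sketch along these lines could count the reads and how many carry soft or hard clips. The function name summarize_clipping is hypothetical, and it assumes it is fed SAM text lines, for example the output of samtools view on each bam file.

```python
from collections import Counter


def summarize_clipping(sam_lines):
    """Count alignments and how many have soft (S) or hard (H) clips.

    Hypothetical helper: expects an iterable of SAM text lines.
    """
    counts = Counter()
    for line in sam_lines:
        # Skip blank lines and header lines (headers start with '@').
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        cigar = fields[5]  # CIGAR is the 6th column of a SAM record
        counts["reads"] += 1
        if "S" in cigar:
            counts["soft_clipped"] += 1
        if "H" in cigar:
            counts["hard_clipped"] += 1
    return counts


example = [
    "@HD\tVN:1.6\n",
    "r1\t0\t9\t33261121\t60\t4S6M\t*\t0\t0\tACGTACGTAC\t*\n",
    "r2\t0\t9\t33261121\t60\t4H6M\t*\t0\t0\tACGTAC\t*\n",
]
print(dict(summarize_clipping(example)))
```

To run it on real data, one option is to pipe samtools view output.bam into a small script that calls summarize_clipping(sys.stdin) and compare the read counts between the soft-clipped and hard-clipped outputs.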
There's a weird return 0 on line 320; it might be a bug. Maybe you should open an issue.
Issue opened in GitHub. I'll take a look at it.
Tagging: John Marshall