What is a good tool that will take a bam file and a bed file describing regions where all aligned bases of aligned reads should be soft-clipped, but leave the aligned bases outside the regions intact and produce an output bam as a result?
What is a good tool that will take a bam file and a bed file describing regions where all aligned bases of aligned reads should be soft-clipped, but leave the aligned bases outside the regions intact and produce an output bam as a result?
I wrote PcrClipReads for a previous question on biostars
I'm not sure if it's really what you need...
update 2021: there is now samtools ampliconclip
samtools-ampliconclip http://www.htslib.org/doc/samtools-ampliconclip.html
Clip reads in a SAM compatible file based on data from a BED file.
Home made solution, not tested:
## Clip reads overlapping target regions (note that -L option is slow as it doesn't use the index)
samtools view -F4 -L clippable.bed -h aln.bam | awk '{if($0 ~ "^@"){print $0} else {len=length($10); $6=len"S"; print $0}}' | samtools view -u - > clipped.tmp.bam
## Get reads *not* overlapping clippable regions (input should be sorted)
bedtools intersect -sorted -ubam -v -a aln.bam -b clippable.bed > unclipped.tmp.bam
## Put them back together
samtools merge aln2.bam clipped.tmp.bam unclipped.tmp.bam
You should decide how to handle reads partially overlapping the clippable regions and make sure these reads are not present in output twice.
Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Brilliant! Bang on the money! Thanks Pierre!
I want to be the first author of whatever you'll write :-)