Hi,
I have CLIP-Seq data and, after mapping, I want to filter my SAM files for reads that contain a deletion (based on the CIGAR string) and have MAPQ > 10. Is there a way in samtools to subset my SAM file to reads with MAPQ > 10 and a deletion?
Thank you very much!
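I would do something like:
samtools view -h file.bam | awk '/^@/ || (($6 ~ /D/) && ($5 > 10))'
Here ($6 ~ /D/) means "deletion in CIGAR" and ($5 > 10) means MAPQ > 10. The -h flag together with the /^@/ test passes the header lines through, so the output is still a valid SAM file.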
I really like awk oneliners for doing quick filters like this.
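Using samjdk (http://lindenb.github.io/jvarkit/SamJdk.html). Note that the expression below also keeps reads with an N (skipped region) operator in addition to D, so remove the CigarOperator.N test if you only want deletions: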
java -jar dist/samjdk.jar -e 'return !record.getReadUnmappedFlag() && record.getMappingQuality()>10 && record.getCigar().getCigarElements().stream().map(C->C.getOperator()).anyMatch(O->O.equals(CigarOperator.D) || O.equals(CigarOperator.N)); ' input.bam
Pysam is a good tool to manipulate SAM/BAM with various attributes like CIGAR and mapping quality.
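For what it's worth, here is a minimal sketch of how the same filter might look with pysam. The function names (has_deletion, keep_read, filter_bam) and the file paths are made up for illustration, and it assumes pysam is installed and that "deletion" means a D operator only, not N.

```python
def has_deletion(cigar):
    """True if the CIGAR string contains a deletion (D) operator."""
    # "*" means no CIGAR is available; N (skipped region) is not counted.
    return cigar is not None and "D" in cigar


def keep_read(mapq, cigar, min_mapq=10):
    """Keep reads with MAPQ strictly above min_mapq and a deletion."""
    return mapq > min_mapq and has_deletion(cigar)


def filter_bam(in_path, out_path, min_mapq=10):
    """Write reads passing keep_read() from in_path to out_path (BAM)."""
    import pysam  # third-party; imported here so the helpers work without it

    with pysam.AlignmentFile(in_path, "rb") as src, \
            pysam.AlignmentFile(out_path, "wb", template=src) as dst:
        for read in src:
            if read.is_unmapped:
                continue
            if keep_read(read.mapping_quality, read.cigarstring, min_mapq):
                dst.write(read)


# Hypothetical usage:
# filter_bam("input.bam", "filtered.bam")
```

Passing template=src copies the header into the output, so the result stays a valid BAM file.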
Thank you very much!!! I'll try samjdk first and then pysam, and compare the results.