Hi, I used the default bowtie option, which allows for three mismatches, to get a BAM file aligned against my reference sequence. I want to know how many reads aligned with terminal mismatches (i.e., the last three, last two, or last one base mismatched at the 3' end of the aligned read at its locus). Additionally, I want to account for reads on both the positive and negative strands. How can I get this information using Rsamtools? How do I use the CIGAR string, and which CIGAR string should I look for? Thanks in advance for your help.
Here is my code:
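library(Rsamtools)
library(GenomicRanges)
library(GenomicAlignments)

bam.file <- "aligned.bam"

# NOTE: `which` must be a GRanges of the regions of interest, defined beforehand;
# if it is left undefined, R passes the base function `which` and the call fails.
params <- ScanBamParam(
  which = which,
  # keep mapped, non-duplicate reads that pass quality control
  flag = scanBamFlag(
    isUnmappedQuery = FALSE,
    isDuplicate = FALSE,
    isNotPassingQualityControls = FALSE
  ),
  simpleCigar = FALSE,
  reverseComplement = FALSE,
  # fields to read from each BAM record
  what = c(
    "qname", "flag", "rname", "seq", "strand",
    "pos", "qwidth", "cigar", "qual", "mapq"
  )
)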
That is, soft or hard clipping, isn't it?
I believe it is soft clipping.
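One point that may help here: in the SAM format a CIGAR `M` covers both matching and mismatching bases, so the CIGAR string alone cannot show terminal mismatches unless the aligner clipped them. Mismatch positions are normally recorded in the `MD` tag instead. The sketch below assumes the BAM file carries `MD` tags and that the `MD` strings and strands have already been extracted; the function names `md_to_mismatch` and `terminal_mismatch_run` are hypothetical helpers written in base R, not part of Rsamtools. It counts the run of consecutive mismatches at the 3' end of each read, reversing the orientation for minus-strand reads because `MD` is written in reference orientation.

```r
# Expand an MD tag into a logical vector with one entry per aligned read base
# (TRUE = mismatch). Deletions ("^ACG") consume no read bases and are skipped.
md_to_mismatch <- function(md) {
  tokens <- regmatches(md, gregexpr("[0-9]+|\\^[A-Za-z]+|[A-Za-z]", md))[[1]]
  tokens <- tokens[!startsWith(tokens, "^")]
  as.logical(unlist(lapply(tokens, function(tok) {
    if (grepl("^[0-9]+$", tok)) rep(FALSE, as.integer(tok)) else TRUE
  })))
}

# Number of consecutive mismatched bases at the 3' end of the read.
# MD is in reference orientation, so for "-" strand reads the 3' end
# is the leftmost aligned base and the vector is reversed first.
terminal_mismatch_run <- function(md, strand) {
  mm <- md_to_mismatch(md)
  if (strand == "-") mm <- rev(mm)
  r <- rle(rev(mm))  # walk inward from the 3' end
  if (length(r$values) > 0 && r$values[1]) r$lengths[1] else 0L
}

# Made-up MD strings and strands, for illustration only
md     <- c("36", "35A0", "0C35", "33A0T0", "0G0T0A33")
strand <- c("+",  "+",    "-",    "+",      "-")

runs <- mapply(terminal_mismatch_run, md, strand, USE.NAMES = FALSE)
table(runs)  # number of reads with 0, 1, 2 or 3 terminal mismatches
```

With Rsamtools, the `MD` tag can be requested by adding `tag = "MD"` to `ScanBamParam()`, and it is then available in the `tag$MD` element of each `scanBam()` result. The strand comes back as a factor, so converting it with `as.character()` before passing it to a helper like this is probably the safer choice.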