Hi,
I think the number of pair mismatches you are asking about (if you mean your merged reads) is a consequence of your alignment, so of step 2 :)
You can get that information simply by comparing the number of aligned reads between the merged and unmerged reads.
I am looking for mismatches in base resolution, not read resolution. Pairs can be merged (or aligned concordantly) even if some overlapping bases disagree. I am looking for the number or rate of these mismatching bases.
Thank you Brian.
Can you clarify the trimq=xx tag (as opposed to qtrim...)?
For example, suppose you have a low-quality base in the middle of a read: is it considered an N or something else?
"qtrim" tells the program which end to trim, while "trimq" specifies the quality threshold. Only the ends can be trimmed (and for merging, the right end is the one that mainly matters). "qtrim=r trimq=15" will trim bases from the right end such that the trimmed region has an average quality below 15, while the remaining region has an average quality of at least 15. For example, if the last 5 base qualities were "20, 0, 17, 19, 16", then the last 4 bases would be trimmed, as that region has an average quality below 15 (note that 0 is an N), but the 20 would not be trimmed. It's a little hard to calculate by eye because the scores are first transformed to the probability scale (where 16 is roughly a 2.5% chance of error) before being averaged. Low-quality bases are not considered N. Rather, if two reads mismatch at a location and one base has a higher quality than the other, the higher-quality base is assumed correct and the resulting quality score is the higher quality minus the lower.
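To make the "hard to calculate by eye" part concrete, here is a rough Python sketch of averaging on the probability scale and picking a right-end trim point. This is not the program's actual code: the function names are made up for illustration, and the cut-point rule (keep the prefix that maximizes the running sum of threshold probability minus per-base error probability) is only an assumption that happens to reproduce the example above.

```python
import math

def phred_to_prob(q):
    """Convert a Phred quality score to an error probability (Q16 is about 2.5%)."""
    return 10 ** (-q / 10)

def mean_quality(quals):
    """Average on the probability scale, then convert back to a Phred score."""
    mean_p = sum(phred_to_prob(q) for q in quals) / len(quals)
    return -10 * math.log10(mean_p)

def right_trim_keep(quals, trimq):
    """Return how many bases to keep when trimming only the right end.

    Assumed rule (hypothetical, not taken from the tool's source): keep the
    prefix that maximizes the running sum of
    (threshold error probability - base error probability).
    """
    threshold_p = phred_to_prob(trimq)
    best_keep, best_score, score = 0, 0.0, 0.0
    for i, q in enumerate(quals, start=1):
        score += threshold_p - phred_to_prob(q)
        if score >= best_score:
            best_keep, best_score = i, score
    return best_keep

# The last five qualities from the example above.
quals = [20, 0, 17, 19, 16]
keep = right_trim_keep(quals, trimq=15)
trimmed_mean = mean_quality(quals[keep:])
print(keep, round(trimmed_mean, 1))
```

With these inputs the sketch keeps the 20 and trims the last four bases, whose probability-scale average is well below 15 because the single quality-0 base dominates the mean.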
Hi, I think the number of pair mismatches you are asking about (if you mean your merged reads) is a consequence of your alignment, so of step 2 :) You can get that information simply by comparing the number of aligned reads between the merged and unmerged reads.
I am looking for mismatches in base resolution, not read resolution. Pairs can be merged (or aligned concordantly) even if some overlapping bases disagree. I am looking for the number or rate of these mismatching bases.
Do you mean variant calling?
You can call it variant calling where one mate declares a different base than the other.
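For what it's worth, if the overlap length of a pair is already known (for example from the merging step), the per-base disagreement can be counted directly. Below is a small Python sketch under assumptions: the helper names are hypothetical, the 3' end of read 1 is assumed to overlap the start of the reverse complement of read 2, and positions where either mate has an N are not counted as mismatches.

```python
def reverse_complement(seq):
    """Reverse-complement a DNA sequence (A, C, G, T, N only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[b] for b in reversed(seq))

def overlap_mismatches(read1, read2, overlap):
    """Count disagreeing bases in a known overlap between two mates.

    Assumes the last `overlap` bases of read1 line up with the first
    `overlap` bases of the reverse complement of read2.
    Returns (mismatch count, mismatch rate).
    """
    if overlap <= 0 or overlap > min(len(read1), len(read2)):
        raise ValueError("overlap must be between 1 and the shorter read length")
    tail = read1[len(read1) - overlap:]
    head = reverse_complement(read2)[:overlap]
    mismatches = sum(
        1 for a, b in zip(tail, head) if a != b and "N" not in (a, b)
    )
    return mismatches, mismatches / overlap

# Toy pair from the fragment ACGTACGTAGCTAGCT with 10 bp reads (4 bp overlap);
# the last base of read 1 has been changed from G to T to create one mismatch.
result = overlap_mismatches("ACGTACGTAT", "AGCTAGCTAC", overlap=4)
print(result)
```

Summing the counts and overlap lengths over all pairs would give an overall mismatching-base rate, which is a base-resolution number rather than a read-resolution one.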