Hello, I have been asked to find the reads that cover more than 80% of the reference. I know that samtools can calculate coverage and depth for regions of interest, but what about each individual read? For example, if reads 1-10 overlap region A, I want to know the percent coverage for each of them. Expected output: read 1, 70% covered, leftmost position 1 (like the position column in SAM format). Are there any tools or scripts that can do this? Thanks in advance.
It seems like several concepts may be confused here. Reads don't have a "coverage"; each read is just one sequenced molecule. Instead, mapped reads can be characterized by their "length" and "mismatch rate". So are you trying to get the coverage of the sequence, get reads where >80% of the read matches the reference, get reads that cover 80% of the reference (which depends on read length), or something else entirely?
Yes, I want the reads that cover at least 80% of my region of interest.
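In case it helps, here is a minimal sketch of how that per-read number could be computed from the SAM POS and CIGAR columns (columns 4 and 6). The function name `region_fraction_covered`, the example reads, and the region 1-100 are made up for illustration. It assumes 1-based inclusive coordinates, that all reads are on the same reference sequence as the region, and that only aligned bases (M, =, X) count as covering; deleted or skipped reference bases (D, N) are not counted.

```python
import re

# Splits a CIGAR string such as "40M20D40M" into (length, operation) pairs.
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def region_fraction_covered(pos, cigar, region_start, region_end):
    """Fraction of the region [region_start, region_end] covered by one read.

    pos is the 1-based leftmost mapping position (SAM column 4) and
    cigar is the CIGAR string (SAM column 6).
    """
    ref = pos  # reference coordinate of the next base to be consumed
    covered = 0
    for length, op in CIGAR_RE.findall(cigar):
        length = int(length)
        if op in "M=X":
            # Aligned block: count how much of it falls inside the region.
            block_end = ref + length - 1
            overlap = min(block_end, region_end) - max(ref, region_start) + 1
            covered += max(0, overlap)
            ref += length
        elif op in "DN":
            # Deletion or skip: advances along the reference, covers nothing.
            ref += length
        # I, S, H and P do not consume reference bases.
    return covered / (region_end - region_start + 1)


# Hypothetical reads as (name, POS, CIGAR); region A is taken to be 1-100.
reads = [
    ("read1", 1, "70M"),
    ("read2", 11, "90M"),
    ("read3", 1, "40M20D40M"),
]

for name, pos, cigar in reads:
    frac = region_fraction_covered(pos, cigar, 1, 100)
    flag = "keep" if frac >= 0.8 else "drop"
    print(f"{name}\t{frac:.0%} covered\tleftmost position {pos}\t{flag}")
```

The name, POS, and CIGAR fields can be taken from the output of `samtools view` on your BAM file. If you would rather count deletions as covered, you could treat D like M in the loop; that is a design choice, not something the SAM format dictates.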