Hi,
I am very new to NGS analysis.
I have used BWA for mapping. My reference genome is a set of simulated long reads of E. coli (I used the Badread tool to generate them), and I have mapped short reads to it. Now I want to know, for each long read, which portions have been mapped. My ultimate goal is to find the distribution of the percentage of each long read that is covered. For example, if a long read is 10k long and I know that 3k of it (characters of the long read) has been mapped, then I can say this long read is 30% (3k/10k) covered by the short reads.
Is there any way to do this?
Thank you.
Hi @genomax, thank you for the help. Using the samtools and awk commands from the link you gave me, I got output like this: Average = 1.08026
What I want is the coverage for each long read that I gave BWA as the reference for mapping. Is there a way to get that?
Take a look at
samtools idxstats
in that case.

Yes, thank you. That works. :)
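
A note for anyone landing here later: samtools idxstats reports, for each reference sequence, its length and the number of reads mapped to it, not the number of bases covered. If you need the covered fraction per long read (the 3k/10k figure from the original question), one possible approach is to parse per-position depth. Below is a minimal sketch, assuming the input is the three-column, tab-separated text (reference name, position, depth) produced by something like "samtools depth -aa sorted.bam", where sorted.bam is a hypothetical file name. The function name covered_fraction and the sample data are made up for illustration.

```python
from collections import defaultdict


def covered_fraction(depth_lines):
    """Return {reference_name: fraction of reported positions with depth > 0}.

    Assumes each line is "name<TAB>position<TAB>depth" and that zero-depth
    positions are present in the input (e.g. samtools depth run with -aa).
    """
    total = defaultdict(int)    # positions reported per reference
    covered = defaultdict(int)  # positions with at least one read
    for line in depth_lines:
        line = line.strip()
        if not line:
            continue
        name, _pos, depth = line.split("\t")[:3]
        total[name] += 1
        if int(depth) > 0:
            covered[name] += 1
    return {name: covered[name] / total[name] for name in total}


# Made-up example: read_1 has 10 positions, 3 of them covered;
# read_2 has 4 positions, none covered.
sample = [f"read_1\t{i}\t{2 if i <= 3 else 0}" for i in range(1, 11)]
sample += [f"read_2\t{i}\t0" for i in range(1, 5)]

fractions = covered_fraction(sample)
print(fractions)  # {'read_1': 0.3, 'read_2': 0.0}
```

With a real file you could pass the open file handle instead of the sample list, then plot or histogram the resulting values to get the distribution. Note that without the zero-depth positions in the input, every reported position counts as covered and the fractions come out as 1.0.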