Question

How to find mapped regions in the reference genome

4.4 years ago

Ashi ▴ 20

Hi,

I am very new to NGS analysis.

I have used BWA for mapping. My reference is a set of simulated long reads of E. coli (generated with the Badread tool), and I have mapped short reads to it. Now I want to know, for each long read, which portions have been mapped. My ultimate goal is to find the distribution of the percentage of each long read that is covered. For example, if a long read is 10k bases long and I know that 3k bases of it have been mapped, then I can say this long read is 30% (3k/10k) covered by the short reads.

Is there any way to do this?

Thank you.

alignment sequencing genome LongReads BWA • 1.3k views

score 0 · Answer 1 · 2020-06-30

4.4 years ago

GenoMax 147k

If you have already done the mapping then calculate the coverage based on the answers here: Tools To Calculate Average Coverage For A Bam File?

Hi @genomax, thank you for the help. Using the samtools and awk commands from the link you gave me, I got output like this: Average = 1.08026

What I want is the coverage for each long read that I used as the reference for mapping in BWA. Is there a way to get that?
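One way to get a per-reference figure is to count, for each long read, the positions with a depth of at least 1 and divide by that read's length. Below is a minimal sketch, assuming the input lines look like `samtools depth` output (three tab-separated columns: reference name, 1-based position, depth) and that the reference lengths are supplied separately. The function name `covered_fraction` and the toy data are hypothetical, for illustration only.

```python
from collections import Counter


def covered_fraction(depth_lines, ref_lengths, min_depth=1):
    """Return {reference name: fraction of positions with depth >= min_depth}.

    depth_lines: iterable of 'name<TAB>position<TAB>depth' strings
                 (assumed to follow the samtools depth layout).
    ref_lengths: dict mapping reference name to its length in bases.
    """
    covered = Counter()
    for line in depth_lines:
        line = line.strip()
        if not line:
            continue
        name, _pos, depth = line.split("\t")
        if int(depth) >= min_depth:
            covered[name] += 1
    # References with no covered positions get a fraction of 0.0.
    return {
        name: covered[name] / length
        for name, length in ref_lengths.items()
        if length > 0
    }


# Toy input (hypothetical): three long reads used as references.
depth_lines = [
    "read1\t1\t2",
    "read1\t2\t1",
    "read1\t3\t0",
    "read2\t5\t4",
]
ref_lengths = {"read1": 10, "read2": 4, "read3": 8}

fractions = covered_fraction(depth_lines, ref_lengths)
for name, frac in fractions.items():
    print(f"{name}\t{frac:.1%}")
```

On real data the lines could be read from the output of `samtools depth` run on the sorted BAM file, and the resulting fractions plotted as a histogram to get the distribution. Recent samtools releases also include a `coverage` subcommand that reports covered bases and percent covered per reference sequence, which may make a custom script unnecessary.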

4.4 years ago by Ashi ▴ 20

Take a look at samtools idxstats in that case.
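For reference, `samtools idxstats` prints one tab-separated line per reference sequence: name, sequence length, number of mapped reads, number of unmapped reads, with a final `*` line for reads that have no reference assigned. Note that these are read counts per reference, not the number of covered bases. A minimal sketch of parsing that output follows; the function name `parse_idxstats` and the sample lines are hypothetical.

```python
def parse_idxstats(lines):
    """Parse idxstats-style lines into {name: (length, mapped, unmapped)}.

    The trailing '*' entry (reads with no reference) is skipped.
    """
    stats = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        name, length, mapped, unmapped = line.split("\t")
        if name == "*":
            continue
        stats[name] = (int(length), int(mapped), int(unmapped))
    return stats


# Toy input (hypothetical) in the idxstats column layout.
idxstats_lines = [
    "read1\t10000\t45\t0",
    "read2\t8000\t0\t0",
    "*\t0\t0\t12",
]

stats = parse_idxstats(idxstats_lines)

# Long reads that received no short-read alignments at all.
unmapped_refs = [name for name, (_, mapped, _) in stats.items() if mapped == 0]
print(unmapped_refs)
```

The lengths parsed here can also serve as the denominators when computing the covered fraction of each long read.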

4.4 years ago by GenoMax 147k

Yes thank you. That works. :)

4.4 years ago by Ashi ▴ 20