Question

Illumina PE - what are causes for coverage drops?

1

9.2 years ago

mschmid ▴ 180

I just assembled a plasmid using Illumina 2x250 PE. I am pretty confident that the assembly is fine.

I checked by mapping the raw reads back to the assembly. In general, coverage is very high (sometimes more than 1000x), and very few pairs are unmapped.

But in some regions the coverage drops to 10-20x. This looks concerning, but I still think my assembly is fine because:

1. In these areas I also see barely any unmapped pairs

2. I do not see more mismatches between the reads and the assembly sequence than in other regions

Now my questions:

1. What are the properties of sequences where Illumina gives lower coverage? Is there a paper on this, or software to check for it?

2. How else can I check whether my assembly is correct in these low-coverage areas?

illumina coverage paired end • 2.5k views

ADD COMMENT • link updated 9.2 years ago by thackl ★ 3.0k • written 9.2 years ago by mschmid ▴ 180

score 6 · Accepted Answer · 2015-09-28

6

9.2 years ago

thackl ★ 3.0k

Check GC content. Illumina library prep can have problems with regions of both high and low GC.
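One quick way to do that check is to compute GC content in sliding windows along the assembled plasmid and compare the extreme windows with the positions of the coverage drops. The following is only a minimal sketch: the function names, the window and step sizes, and the 0.3/0.7 cutoffs are illustrative assumptions, not established thresholds.

```python
def gc_windows(seq, window=100, step=50):
    """Return (start, gc_fraction) for sliding windows over seq (0-based starts)."""
    seq = seq.upper()
    results = []
    # If the sequence is shorter than one window, a single partial window is used.
    for start in range(0, max(len(seq) - window, 0) + 1, step):
        chunk = seq[start:start + window]
        if not chunk:
            break
        gc = chunk.count("G") + chunk.count("C")
        results.append((start, gc / len(chunk)))
    return results


def flag_extreme_gc(windows, low=0.3, high=0.7):
    """Keep windows whose GC fraction is outside [low, high] (assumed cutoffs)."""
    return [(start, gc) for start, gc in windows if gc < low or gc > high]


if __name__ == "__main__":
    # Toy sequence; in practice, read the plasmid sequence from the FASTA file.
    toy = "GGCC" * 10 + "ATAT" * 10 + "ACGT" * 10
    for start, gc in flag_extreme_gc(gc_windows(toy, window=40, step=40)):
        print(start, round(gc, 2))
```

If the flagged windows line up with the low-coverage regions, GC bias is a plausible explanation.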

ADD COMMENT • link 9.2 years ago by thackl ★ 3.0k

0

Thanks thackl! So low/high GC content is mostly the problem? Any other factors?

ADD REPLY • link 9.2 years ago by mschmid ▴ 180

1

I'd say GC is the most common, particularly if you observe some local valleys in coverage.
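To locate such valleys systematically, one option is to take per-position depths (for example the third column of `samtools depth -a` output) and report runs that fall well below the median. This is a rough sketch; the function name and the 10% of median cutoff are assumptions chosen for illustration.

```python
from statistics import median


def low_coverage_regions(depths, fraction=0.1):
    """Return 0-based, half-open (start, end) runs with depth < fraction * median."""
    if not depths:
        return []
    cutoff = fraction * median(depths)
    regions = []
    start = None
    for i, depth in enumerate(depths):
        if depth < cutoff:
            if start is None:
                start = i  # a valley opens here
        elif start is not None:
            regions.append((start, i))  # the valley closed at the previous position
            start = None
    if start is not None:
        regions.append((start, len(depths)))  # valley runs to the end
    return regions


if __name__ == "__main__":
    toy_depths = [1000] * 5 + [15] * 3 + [1000] * 4
    print(low_coverage_regions(toy_depths))
```

The resulting intervals can then be compared with the GC content of the same stretches.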

Of course there are also more sample-specific issues, e.g. substances such as sugars that stick to the DNA and create biases during DNA extraction, but that is probably not the case here.

Another possibility is structural heterogeneity, for example if you have two versions of a plasmid, one with a larger deletion and one without. If you assemble the longer variant and then map, you will get low coverage across the deleted region. But in this special case you would get sharp coverage drops, with split-mapped reads at the ends, etc.
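A simple way to look for that signature is to count where reads are clipped. In a SAM record, POS (field 4) is the 1-based leftmost aligned position and CIGAR (field 6) records soft (S) or hard (H) clips, so clipped ends that pile up at one reference position suggest a breakpoint. The sketch below assumes you have already extracted (POS, CIGAR) pairs from the alignments; the function names are made up for illustration.

```python
import re
from collections import Counter

CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")
REF_CONSUMING = set("MDN=X")  # CIGAR operations that advance along the reference


def clip_breakpoints(pos, cigar):
    """Return 1-based reference positions of the aligned base next to each clip."""
    ops = [(int(n), op) for n, op in CIGAR_RE.findall(cigar)]
    if not ops:
        return []  # e.g. CIGAR "*" for an unmapped read
    breakpoints = []
    if ops[0][1] in "SH":
        breakpoints.append(pos)  # clip on the left: first aligned base
    ref_len = sum(n for n, op in ops if op in REF_CONSUMING)
    if len(ops) > 1 and ops[-1][1] in "SH":
        breakpoints.append(pos + ref_len - 1)  # clip on the right: last aligned base
    return breakpoints


def breakpoint_pileup(alignments):
    """Count clipped read ends per reference position from (pos, cigar) pairs."""
    counts = Counter()
    for pos, cigar in alignments:
        counts.update(clip_breakpoints(pos, cigar))
    return counts


if __name__ == "__main__":
    toy = [(100, "80M20S"), (120, "60M40S"), (180, "30S70M"), (50, "100M")]
    for position, n in sorted(breakpoint_pileup(toy).items()):
        print(position, n)
```

Many clipped ends stacked at the edges of a coverage drop would point to a deletion variant; a smooth valley without clipping looks more like GC bias.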

ADD REPLY • link 9.2 years ago by thackl ★ 3.0k