Question

Relative abundance of plasmid DNA content in each sample

8 months ago

adarsh_munna ▴ 60

Hi,

I have some E. coli WGS samples sequenced on Illumina, which contain both plasmid DNA and genomic DNA. I need to find the relative abundance of plasmid DNA with respect to genomic DNA in each sample.

How do I go about this? Would aligning the reads to a reference containing both the genome and the plasmid, and then counting the mapped reads, work?

Please give some suggestions.

Thanks

Illumina E.coli WGS • 752 views

If you know the size of the plasmid then you could indeed align to the genome and then measure depth of coverage using a tool such as pandepth (LINK) or mosdepth (LINK). This would give you a relative estimate of the proportion of the two entities. The inherent assumption is that no special cleanup or processing during extraction, library prep, or sequencing has altered the relative amounts of genomic and plasmid DNA.
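To make the depth comparison concrete, here is a minimal Python sketch. It assumes you already have a tab-separated per-contig summary with `chrom` and `mean` columns (I believe this is the layout of the mosdepth summary file, but check your own output). The contig names and the numbers in `example` are made up purely for illustration.

```python
import csv
import io


def depth_ratio(summary_text, chromosome_name, plasmid_name):
    """Return mean plasmid depth divided by mean chromosome depth.

    summary_text: tab-separated text with a header row containing at least
    the columns 'chrom' and 'mean' (assumed layout, verify against your tool).
    """
    reader = csv.DictReader(io.StringIO(summary_text), delimiter="\t")
    mean_depth = {row["chrom"]: float(row["mean"]) for row in reader}
    return mean_depth[plasmid_name] / mean_depth[chromosome_name]


# Hypothetical summary: names and values are illustrative only.
example = (
    "chrom\tlength\tbases\tmean\tmin\tmax\n"
    "chromosome\t4600000\t230000000\t50.0\t0\t120\n"
    "plasmid\t5000\t1500000\t300.0\t10\t600\n"
)

print(depth_ratio(example, "chromosome", "plasmid"))
```

Because mean depth is already averaged per base, this ratio does not need a further correction for the difference in length between plasmid and chromosome.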

8 months ago by GenoMax 151k

score 0 · Answer 1 · 2024-08-14

Not sure what exactly you are trying to do, but plasmid abundance will depend heavily on its copy number. If you are trying to calculate that copy number, then this approach might give you some clues. Beyond that, high-copy plasmids will be more abundant and low-copy plasmids less so.

Relative abundance in the population is not necessarily going to tell you how many cells have the plasmid. If you get a 3x higher plasmid:gDNA ratio in one population, that could mean one of two things that you won't be able to distinguish: 1) 3x more cells have the plasmid in one population vs. the other; 2) plasmid copy number is 3x greater in one population even though the same number of cells have the plasmid. Of course, there are always the in-between scenarios where part of the difference is explained by copy number and the other part by an actual difference in the number of cells that carry the plasmid.
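A toy calculation shows why the two cases look identical in bulk data. This is only a sketch under a simplifying assumption that is mine, not something measured: every cell contributes one chromosome equivalent, so the bulk plasmid:gDNA ratio is the fraction of plasmid-carrying cells times the copy number per carrying cell. All numbers are hypothetical.

```python
def bulk_plasmid_to_gdna_ratio(fraction_of_cells_with_plasmid, copies_per_carrying_cell):
    """Bulk plasmid:gDNA ratio, assuming one chromosome equivalent per cell."""
    return fraction_of_cells_with_plasmid * copies_per_carrying_cell


# Hypothetical baseline population: 25% of cells carry the plasmid, 4 copies each.
baseline = bulk_plasmid_to_gdna_ratio(0.25, 4)

# Scenario 1: 3x more cells carry the plasmid, same copy number.
more_cells = bulk_plasmid_to_gdna_ratio(0.75, 4)

# Scenario 2: same fraction of cells, 3x higher copy number.
more_copies = bulk_plasmid_to_gdna_ratio(0.25, 12)

# Both scenarios give the same fold change relative to the baseline.
print(more_cells / baseline, more_copies / baseline)
```

Only the product of the two factors is visible in the sequencing data, so any combination with the same product gives the same ratio.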

score 0 · Answer 2 · 2024-08-15

Just adding my two cents here. I would say that WGS is not the optimal way of quantifying things, because the number of reads you get depends heavily not only on genome abundances in your sample but also on sequencing depth. By this I mean that if you don't sequence deeply enough you might miss low-abundance DNA, and if the ratio of E. coli to plasmid is too high you might run into the same trouble.

That being said, I have some questions. Do you know the content of your sample? Is there only this E. coli, carrying only a plasmid that you know? Or is it an experiment in which you "fish" for plasmids using a certain E. coli, for example? If it is the latter, I would be careful: with short-read sequencing you'll get fragments that match a plasmid, but you won't be certain whether they come from just one plasmid or from several.

I think your best option is, as you said, mapping against a reference. How you proceed from there is up to your experimental design. If you're absolutely sure nothing but plasmid and E. coli is there, you could map against a reference E. coli genome, and the unmapped reads could represent your plasmid reads. Then you can estimate the proportion. Still, I would be careful and keep in mind that you're always looking at an estimate that could deviate from the truth.
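For the "estimate the proportion" step, a rough Python sketch might look like this. It assumes you already have the two read counts from your mapping (mapped reads as chromosome, unmapped reads as plasmid), and, for the second function, that you know both sequence lengths. The function names and the example numbers are hypothetical.

```python
def plasmid_read_fraction(chromosome_reads, plasmid_reads):
    """Fraction of all counted reads that are attributed to the plasmid."""
    total = chromosome_reads + plasmid_reads
    if total == 0:
        raise ValueError("no reads counted")
    return plasmid_reads / total


def length_normalised_ratio(chromosome_reads, plasmid_reads,
                            chromosome_length, plasmid_length):
    """Plasmid reads per base divided by chromosome reads per base."""
    plasmid_density = plasmid_reads / plasmid_length
    chromosome_density = chromosome_reads / chromosome_length
    return plasmid_density / chromosome_density


# Hypothetical counts and lengths, for illustration only.
fraction = plasmid_read_fraction(chromosome_reads=920_000, plasmid_reads=80_000)
ratio = length_normalised_ratio(
    chromosome_reads=920_000,
    plasmid_reads=80_000,
    chromosome_length=4_600_000,
    plasmid_length=5_000,
)
print(fraction, ratio)
```

The raw read fraction tells you how much of the sequenced DNA is plasmid. The length-normalised ratio is closer to a copies-per-chromosome figure, since a small plasmid yields few reads per copy compared with a chromosome of several megabases.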

As a safer alternative that doesn't have to do with bioinformatics, you could simply run a couple of qPCRs.