Plasmid sequence separation from assembly contigs/FASTQ files of a bacterial complete genome?
4.1 years ago
MSRS ▴ 590

Hi Biostars Community,

I have SPAdes assembly contigs for an E. coli complete genome. Using PlasmidFinder (https://cge.cbs.dtu.dk/services/PlasmidFinder/), we found several plasmid hits in these contigs, each around 150-650 bp long.

Plasmid         Identity (%)  Query / Template length  Contig      Position in contig  Note  Accession number
Col(BS512)      100           233 / 233                contigs113  1956..2188                NC010656
IncFIA          100           388 / 388                contigs98   13867..14254              AP001918
IncFIB(pB171)   99.22         643 / 643                contigs99   1765..2407                AB024946
IncFII          98.08         261 / 261                contigs96   3490..3750                AY458016
IncI(Gamma)     100           137 / 141                contigs97   24329..24465              AP011954

Is there any way to separate those plasmids from contigs/fastq files? Thank you.

Assembly plasmid • 964 views

Not easily, no. Removing named sequences is easy (check the forum for lots of answers); however, unambiguously identifying a plasmid is an unsolved problem.

You can run a plasmid-finder tool and then separate the FASTA records by name, but it's never going to be 100% effective. 150 bp isn't so much a contig as a read. They're all so short that they're likely just junk. You aren't going to get a useful plasmid assembly out of those, no matter what you do with them.
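
For the "separate by name" part, here is a minimal sketch in plain Python. It assumes the first word of each FASTA header matches the contig names reported by PlasmidFinder (contigs96, contigs113, and so on); that may not hold for your file, so check the headers first. The function names and file paths are made up for illustration. Note that it only pulls out the contigs that carry a hit; it does not prove those contigs are plasmids.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def split_by_name(records, wanted_ids):
    """Split records into (selected, rest) using the first word of each header."""
    selected, rest = [], []
    for header, seq in records:
        parts = header.split()
        seq_id = parts[0] if parts else ""
        if seq_id in wanted_ids:
            selected.append((header, seq))
        else:
            rest.append((header, seq))
    return selected, rest


def write_fasta(records, path, width=60):
    """Write (header, sequence) pairs to a FASTA file, wrapping sequence lines."""
    with open(path, "w") as out:
        for header, seq in records:
            out.write(f">{header}\n")
            for i in range(0, len(seq), width):
                out.write(seq[i:i + width] + "\n")


def split_fasta_file(in_path, hits_path, rest_path, wanted_ids):
    """Read one FASTA file and write the selected and remaining contigs separately."""
    with open(in_path) as handle:
        hits, rest = split_by_name(read_fasta(handle), wanted_ids)
    write_fasta(hits, hits_path)
    write_fasta(rest, rest_path)


# Contig names copied from the PlasmidFinder table in the question.
# ASSUMPTION: the FASTA header IDs use exactly these names.
PLASMID_HIT_CONTIGS = {"contigs96", "contigs97", "contigs98", "contigs99", "contigs113"}
```

You would call it as something like split_fasta_file("contigs.fasta", "plasmid_hit_contigs.fasta", "remaining_contigs.fasta", PLASMID_HIT_CONTIGS), where all three paths are placeholders.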
