4.1 years ago
MSRS
Hi Biostars Community,
I have SPAdes assembly contigs for a complete E. coli genome. Using PlasmidFinder (https://cge.cbs.dtu.dk/services/PlasmidFinder/), we found that several plasmids are present in these contigs, with matches of around 150-650 bp.
Plasmid        Identity  Query / Template length  Contig      Position in contig  Note  Accession number
Col(BS512)     100       233 / 233                contigs113  1956..2188                NC010656
IncFIA         100       388 / 388                contigs98   13867..14254              AP001918
IncFIB(pB171)  99.22     643 / 643                contigs99   1765..2407                AB024946
IncFII         98.08     261 / 261                contigs96   3490..3750                AY458016
IncI(Gamma)    100       137 / 141                contigs97   24329..24465              AP011954
Is there any way to separate these plasmids from the contigs or FASTQ files? Thank you.
Not easily, no. Removing named sequences is easy (check the forum for lots of answers); however, unambiguously identifying a plasmid is an unsolved problem.
You can use plasmid-finder tools and then separate the FASTA records by name, but it's never going to be 100% effective. At 150 bp, a sequence isn't so much a contig as a read. They're all so short that they're likely just junk. You aren't going to get a useful plasmid assembly out of those, no matter what you do with them.
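For the "separate by name" step, here is a minimal Python sketch using only the standard library. It is an illustration, not a tested pipeline: the function names (parse_fasta, split_by_name) are made up for this example, and the contig IDs are simply the ones from the PlasmidFinder table above. They have to match whatever headers your assembly FASTA actually uses, and the sequences below are toy data.

```python
def parse_fasta(text):
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts: emit the previous one, if any.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def split_by_name(text, flagged_names):
    """Split records into (flagged, rest) by the first word of each header."""
    flagged, rest = [], []
    for header, seq in parse_fasta(text):
        words = header.split()
        name = words[0] if words else ""
        (flagged if name in flagged_names else rest).append((header, seq))
    return flagged, rest


# Assumption: contig IDs as reported in the PlasmidFinder table.
plasmid_hits = {"contigs113", "contigs98", "contigs99", "contigs96", "contigs97"}

# Toy input; in practice you would read your assembly file into a string.
example = """>contigs1 some description
ACGTACGT
ACGT
>contigs113
GGGCCC
>contigs98
TTTAAA
"""

flagged, rest = split_by_name(example, plasmid_hits)
for header, seq in flagged:
    print(f">{header}\n{seq}")
```

Note that this only pulls out whole contigs that contain a replicon hit. It does not tell you whether the rest of that contig is plasmid or chromosome, which is the hard part described above.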