Nanopore sequencing - plasmid assembly length confusion
6 weeks ago
abedkurdi10 ▴ 190

Hello all,

I recently received raw FASTQ files for plasmids sequenced with Oxford Nanopore (long reads). The plasmid is about 6,500 bp long.

First, I ran QC and found very long reads, much longer than the plasmid itself.

Second, when I assembled the reads with Unicycler, I obtained contigs longer than the plasmid (which I expected, given the very long reads).

Third, I performed pairwise sequence alignment with Clustal Omega and found overlap between the contigs. The alignments are available here: https://github.com/abedkurdi/testing_shiny_app/blob/master/clustalo-I20250220-090531-0648-32923810-p1m.aln-clustal_num

I appreciate any guidance in this matter.

Thank you.

assembly long nanopore reads • 799 views
ADD COMMENT
6 weeks ago

You could have a look at this tool to annotate your plasmid contigs - https://github.com/mmcguffi/pLannotate

Try BLASTing the contigs to look for contaminating sequences, or use Kraken or Centrifuge to check.

A simple ORF-finding tool might help you extract coding sequences, which make BLAST searches and contamination checks easier than working with long contigs.
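To illustrate the ORF idea, here is a minimal sketch in plain Python, not a replacement for a dedicated tool. It scans all six reading frames of a contig for ATG-to-stop stretches. The names `find_orfs` and `Orf` and the `min_len` cutoff are my own assumptions for illustration, and the scan treats the contig as linear, so an ORF spanning the origin of a circular plasmid would be missed.

```python
from typing import Iterator, NamedTuple

COMPLEMENT = str.maketrans("ACGT", "TGCA")
STOP_CODONS = {"TAA", "TAG", "TGA"}


class Orf(NamedTuple):
    strand: str  # "+" for the given sequence, "-" for its reverse complement
    frame: int   # 0, 1 or 2 on the scanned strand
    start: int   # 0-based start on the scanned strand
    end: int     # exclusive end on the scanned strand (stop codon included)
    seq: str     # nucleotide sequence from ATG through the stop codon


def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq: str, min_len: int = 300) -> Iterator[Orf]:
    """Yield ATG..stop ORFs of at least min_len nucleotides in all six frames."""
    seq = seq.upper()
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None  # position of the first ATG since the last stop codon
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i
                elif start is not None and codon in STOP_CODONS:
                    if i + 3 - start >= min_len:
                        yield Orf(strand, frame, start, i + 3, s[start:i + 3])
                    start = None
```

Each `Orf.seq` could then be written to a FASTA file and searched with BLAST individually, which tends to give cleaner hits than a whole contig.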

ADD COMMENT

I forgot to mention that I have already BLASTed the contigs: the top hits are Pseudomonas putida for all contigs in all samples. Could this be contamination that is interfering with the data?

I also received another batch of samples for a different group. I ran BLAST on those too, and the top hits are Mycoplasma.

ADD REPLY

At least the Mycoplasma sounds very much like contamination of the sample to me. Talk to the people who did the sequencing for you.

Your plasmid contigs or genes may still be present among the contamination; search for them with BLAST or a similar tool.

ADD REPLY

If your data has contamination with non-plasmid DNA then you would need to account for it, potentially prior to assembly. You should be able to bin the non-plasmid reads out.
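As a rough sketch of what binning could look like when a reference plasmid sequence is available: keep reads that share a large fraction of their k-mers with the reference and set the rest aside. In practice a read mapper such as minimap2 would be the usual choice; the pure-Python version below only illustrates the idea. The function names, `k=15` and the `min_fraction=0.3` cutoff are assumptions of mine and would need tuning on real data.

```python
from typing import Iterable, List, Set, Tuple

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.translate(_COMPLEMENT)[::-1]


def reference_kmers(reference: str, k: int = 15) -> Set[str]:
    """Collect k-mers from both strands of a circular reference."""
    ref = reference.upper()
    # Append the first k-1 bases so k-mers spanning the origin are included.
    circular = ref + ref[:k - 1]
    kmers = set()
    for strand in (circular, reverse_complement(circular)):
        for i in range(len(strand) - k + 1):
            kmers.add(strand[i:i + k])
    return kmers


def shared_fraction(read: str, kmers: Set[str], k: int = 15) -> float:
    """Fraction of the read's k-mers that also occur in the reference."""
    read = read.upper()
    n = len(read) - k + 1
    if n <= 0:
        return 0.0
    hits = sum(read[i:i + k] in kmers for i in range(n))
    return hits / n


def bin_reads(
    reads: Iterable[Tuple[str, str]],
    reference: str,
    k: int = 15,
    min_fraction: float = 0.3,  # arbitrary cutoff, an assumption to tune
) -> Tuple[List[str], List[str]]:
    """Split (name, sequence) reads into plasmid-like and other read names."""
    kmers = reference_kmers(reference, k)
    plasmid, other = [], []
    for name, seq in reads:
        target = plasmid if shared_fraction(seq, kmers, k) >= min_fraction else other
        target.append(name)
    return plasmid, other
```

Only the plasmid-like reads would then go into the assembler, while the other bin can be BLASTed or classified separately to see what the contaminant is.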

ADD REPLY

I ran pLannotate on one of the samples and on the reference plasmid sequence. Comparing the outputs, I noticed that the features appear in multiple copies in the sample but only once each in the reference. Is that unexpected? Could I have concatemers?
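One way to test the concatemer idea directly on a contig, sketched under my own assumptions: index every k-mer position, record the distance between consecutive occurrences of the same k-mer, and take the most frequent distance as the repeat unit length. The function name `estimate_repeat_period` and the defaults for `k` and `min_support` are hypothetical. Only direct (same-strand) repeats are detected, and indel errors in the contig will spread the signal over neighbouring distances.

```python
from collections import Counter, defaultdict
from typing import Optional, Tuple


def estimate_repeat_period(
    contig: str, k: int = 15, min_support: int = 50
) -> Optional[Tuple[int, int]]:
    """Return (period, support) for the dominant direct-repeat spacing, or None.

    support is the number of k-mer pairs found at that spacing; results with
    fewer than min_support pairs are treated as noise.
    """
    contig = contig.upper()
    positions = defaultdict(list)
    for i in range(len(contig) - k + 1):
        positions[contig[i:i + k]].append(i)

    offsets = Counter()
    for hits in positions.values():
        # Distance between consecutive occurrences of the same k-mer.
        for a, b in zip(hits, hits[1:]):
            offsets[b - a] += 1

    if not offsets:
        return None
    period, support = offsets.most_common(1)[0]
    return (period, support) if support >= min_support else None
```

A period close to the expected plasmid length (about 6,500 bp here) on a contig two or three times that size would be consistent with tandem copies of the plasmid, although it does not say whether they arose in the cells, during the prep, or in the assembly.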

ADD REPLY

Was the plasmid isolated before library prep? How was the library prep done?

ADD REPLY

The plasmid was extracted with a Qiagen maxiprep kit. For library preparation, the ONT Rapid Barcoding Kit (SQK-RBK114.24) was used.

ADD REPLY

Any insights?

ADD REPLY

Do you expect the prep to be pure plasmid DNA?

ADD REPLY

It should be, right?

ADD REPLY
