Question

Nanopore sequencing - plasmid assembly length confusion

6 weeks ago

abedkurdi10 ▴ 190

Hello all,

I recently received raw FASTQ files for plasmids sequenced with Oxford Nanopore (long reads). The plasmid is about 6,500 bp long.

First, I ran QC and found very long reads, much longer than the plasmid itself.
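
To put numbers on this, read lengths can be tabulated against the expected plasmid size. The following is only a minimal sketch: it assumes a plain four-line-per-record FASTQ (optionally gzipped), and the 6,500 bp size and 10% tolerance are illustrative values, not recommendations.

```python
import gzip
from statistics import median

def read_lengths(path):
    """Yield the length of each read in a four-line-per-record FASTQ file."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        for i, line in enumerate(handle):
            if i % 4 == 1:  # the sequence line of each record
                yield len(line.strip())

def summarize(lengths, plasmid_len=6500, tolerance=0.1):
    """Summarize read lengths and count reads near 1x, 2x, 3x... the plasmid size."""
    lengths = list(lengths)
    near_multiples = {}
    for n in lengths:
        multiple = round(n / plasmid_len)
        # A read counts as "k x plasmid" if it lies within
        # tolerance * plasmid_len of k * plasmid_len.
        if multiple >= 1 and abs(n - multiple * plasmid_len) <= tolerance * plasmid_len:
            near_multiples[multiple] = near_multiples.get(multiple, 0) + 1
    return {
        "n_reads": len(lengths),
        "median": median(lengths) if lengths else 0,
        "max": max(lengths, default=0),
        "near_multiples": near_multiples,
    }
```

For example, `summarize(read_lengths("reads.fastq.gz"))` (the file name is a placeholder) returns a small dictionary. Many reads near 2x or 3x the plasmid size would be one hint of multimeric molecules, while lengths with no relation to the plasmid size would point more toward other DNA in the sample.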

Second, when I assembled the reads with Unicycler, I obtained contigs longer than the plasmid (which is expected given the very long reads).

Third, I performed pairwise sequence alignment with Clustal Omega and found overlap between the contigs. The alignments are available here: https://github.com/abedkurdi/testing_shiny_app/blob/master/clustalo-I20250220-090531-0648-32923810-p1m.aln-clustal_num

I appreciate any guidance in this matter.

Thank you.

assembly long nanopore reads • 799 views

ADD COMMENT • link 5 weeks ago by abedkurdi10 ▴ 190

score 1 · Answer 1 · 2025-02-20

6 weeks ago

colindaven 7.3k

You could have a look at this tool to annotate your plasmid contigs - https://github.com/mmcguffi/pLannotate

Try BLASTing the contigs to look for contaminating sequences, or use Kraken/Centrifuge to check.

A simple ORF-finding tool might help you extract coding sequences, which are easier to BLAST and check for contamination than long contigs.
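
As an illustration of what such a tool does, here is a minimal six-frame ORF scan in plain Python. It is only a sketch under simple assumptions (ATG start, standard stop codons, no handling of ORFs that wrap around a circular sequence); the function names and the 300 bp minimum length are hypothetical choices, and a dedicated ORF finder would be the better option for real work.

```python
STOPS = {"TAA", "TAG", "TGA"}

def revcomp(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def find_orfs(seq, min_len=300):
    """Return (strand, start, end, orf_seq) for ATG...stop ORFs in all six frames.

    Coordinates are 0-based, end-exclusive, and relative to the strand
    that was scanned (for "-" that is the reverse complement of seq).
    """
    seq = seq.upper()
    orfs = []
    for strand, s in (("+", seq), ("-", revcomp(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first in-frame start codon opens an ORF
                elif start is not None and codon in STOPS:
                    if i + 3 - start >= min_len:
                        orfs.append((strand, start, i + 3, s[start:i + 3]))
                    start = None  # close the ORF and look for the next start
    return orfs
```

The sequences returned in the last tuple field could then be written to FASTA and used as BLAST queries in place of the full contigs.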

ADD COMMENT • link 6 weeks ago by colindaven 7.3k

I forgot to mention that I have already BLASTed the contigs, and the top hits are "Pseudomonas putida" for all the contigs in all the samples. Do you think this could be contamination that is interfering with the data?

I also received another batch of samples from a different group. I ran BLAST on those as well, and the top hits are Mycoplasma.

ADD REPLY • link 6 weeks ago by abedkurdi10 ▴ 190

At least the Mycoplasma sounds very much like contamination of the sample to me. Talk to the people who did the sequencing for you.

It could be that your plasmid contigs/genes are still present among the contamination; search for them with BLAST or a similar tool.

ADD REPLY • link 6 weeks ago by colindaven 7.3k

If your data has contamination with non-plasmid DNA then you would need to account for it, potentially prior to assembly. You should be able to bin the non-plasmid reads out.
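
One way this binning could look, sketched here as a plain k-mer screen against the known plasmid reference rather than a proper alignment (mapping the reads with a long-read aligner and keeping the mapped ones would be the more usual route). The values of `k` and `min_fraction` are untuned assumptions, and all function names are hypothetical.

```python
def revcomp(seq):
    """Reverse complement of a DNA string."""
    return seq.translate(str.maketrans("ACGTacgt", "TGCAtgca"))[::-1]

def kmers(seq, k):
    """Set of all k-mers in seq (empty if seq is shorter than k)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def build_index(reference, k=15, circular=True):
    """Reference k-mers on both strands; wraps around the origin if circular."""
    ref = reference.upper()
    if circular:
        ref = ref + ref[:k - 1]  # include k-mers that span the origin
    return kmers(ref, k) | kmers(revcomp(ref), k)

def plasmid_fraction(read, index, k=15):
    """Fraction of the read's distinct k-mers that occur in the reference index."""
    read_kmers = kmers(read, k)
    if not read_kmers:
        return 0.0
    return len(read_kmers & index) / len(read_kmers)

def bin_reads(reads, reference, k=15, min_fraction=0.3):
    """Split (name, sequence) pairs into plasmid-like and other read names."""
    index = build_index(reference, k)
    plasmid, other = [], []
    for name, seq in reads:
        target = plasmid if plasmid_fraction(seq, index, k) >= min_fraction else other
        target.append(name)
    return plasmid, other
```

Because exact k-mer matches are sensitive to sequencing errors, the threshold is deliberately loose; reads in the `other` bin are the ones worth checking against Pseudomonas or Mycoplasma references.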

ADD REPLY • link 6 weeks ago by GenoMax 150k

I ran pLannotate on one of the samples and on the reference plasmid sequence. Comparing the outputs, the features appear in multiple copies in the sample but only once each in the reference. Is that unusual? Is it possible that I have "concatemers"?
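
A rough way to test the concatemer idea, as a sketch: count how many times each unique k-mer of the reference occurs in a contig. A median near 2 or 3 would be consistent with tandem copies of the plasmid in that contig, while a median near 1 would not. This relies on exact k-mer matches, so it assumes a reasonably accurate assembly; `estimate_copies` and `k=21` are hypothetical choices.

```python
from collections import Counter
from statistics import median

def revcomp(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def estimate_copies(contig, reference, k=21):
    """Median number of occurrences in the contig of each unique reference k-mer."""
    contig, reference = contig.upper(), reference.upper()
    ref_counts = Counter(reference[i:i + k] for i in range(len(reference) - k + 1))
    # Use only k-mers that occur once in the reference, so that repeats
    # within the plasmid itself do not inflate the estimate.
    unique_ref = {kmer for kmer, n in ref_counts.items() if n == 1}
    if not unique_ref:
        return 0.0
    contig_counts = Counter()
    for strand in (contig, revcomp(contig)):  # the contig may be on either strand
        for i in range(len(strand) - k + 1):
            kmer = strand[i:i + k]
            if kmer in unique_ref:
                contig_counts[kmer] += 1
    return median(contig_counts.get(kmer, 0) for kmer in unique_ref)
```

Dividing the contig length by the reference length gives a quick cross-check: a contig close to an integer multiple of 6,500 bp with a matching k-mer copy estimate would support the concatemer explanation.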

ADD REPLY • link 6 weeks ago by abedkurdi10 ▴ 190

Was the plasmid isolated before library prep? How was the lib prep done?

ADD REPLY • link 6 weeks ago by GenoMax 150k

The plasmid was extracted with a Qiagen maxiprep kit. For library preparation, the ONT Rapid Barcoding Kit (SQK-RBK114.24) was used.

ADD REPLY • link 6 weeks ago by abedkurdi10 ▴ 190

Any insights?

ADD REPLY • link 5 weeks ago by abedkurdi10 ▴ 190

Do you expect the prep to be pure plasmid DNA?

ADD REPLY • link 5 weeks ago by GenoMax 150k

It should be, right?

ADD REPLY • link 5 weeks ago by abedkurdi10 ▴ 190