Bacterial genome analysis - how to check if a newly sequenced genome has errors
10 months ago
Veselina • 0

Hello! I am quite new to the field of genomics, so I apologize if my question is inadequate. I am a molecular biologist and have recently started as a research student, analyzing bacterial genomes. The genome of the bacteria I work with has been sequenced and is available in GenBank as the reference genome. However, this sequencing was done more than 10 years ago using Sanger, and there are numerous gaps in the data.

Before I joined, my lab's team re-sequenced the genome using long reads (PacBio) and short reads (Illumina). We have a new assembly based on the long reads, while the Illumina data is available only in raw format. Additionally, we sequenced the genome of a mutant of the same bacterium. When we compared the mutant's genome with the new assembly, we found a few mismatches (suggested mutations). Interestingly, at these positions the mutant matches the GenBank reference genome. Consequently, we cannot tell whether the mismatches arise from mutations in the mutant's genome or from errors in the new assembly.

My question is: how can I verify whether the mismatches are mutations or just errors in the newly assembled genome? Can I use the raw short-read data to check, and if so, how?

Thank you so much!

genome next-gen alignment • 684 views
10 months ago
GenoMax 147k

how can I verify if the mismatches are mutations or just errors in the newly assembled genome?

You may want or need an independent method for absolute verification (that is what is currently done with patient samples). You can amplify the regions containing the suspected mutations and Sanger-sequence them.

You could also align the Illumina reads to the assembled genome and check for potential errors. Any errors should be readily apparent, since the error rate of Illumina reads is generally much lower than that of long reads, and you should see reads with the same sequence piling up at each position, providing depth.
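Once the reads are aligned, the check itself comes down to asking, at each mismatch position, whether the Illumina reads agree with the assembly. Below is a minimal Python sketch of that logic, not a replacement for a proper variant caller. The function name `flag_suspect_positions`, the thresholds `min_depth` and `min_fraction`, and the toy data are all assumptions made for illustration; in practice the per-position base counts would come from a pileup of the aligned reads.

```python
from collections import Counter


def flag_suspect_positions(assembly, pileup, min_depth=10, min_fraction=0.9):
    """Return positions where aligned reads consistently disagree with the assembly.

    assembly: the assembled sequence as a string.
    pileup:   dict mapping 0-based position -> Counter of bases seen in reads.
    The thresholds are illustrative assumptions, not recommended values.
    """
    suspects = []
    for pos, counts in sorted(pileup.items()):
        depth = sum(counts.values())
        if depth < min_depth:
            continue  # too few reads covering this position to judge
        base, n = counts.most_common(1)[0]
        # A likely assembly error: nearly all reads show a base other than the assembly's.
        if base != assembly[pos] and n / depth >= min_fraction:
            suspects.append((pos, assembly[pos], base, depth))
    return suspects


# Toy data (hypothetical): position 5 is 'C' in the assembly but 'T' in almost every read.
assembly = "ACGTACGT"
pileup = {
    2: Counter({"G": 30}),          # reads agree with the assembly
    5: Counter({"T": 28, "C": 1}),  # reads disagree with the assembly
    6: Counter({"A": 3}),           # depth too low to call
}
print(flag_suspect_positions(assembly, pileup))  # [(5, 'C', 'T', 29)]
```

If a mismatch position is flagged this way, the assembly is the more likely culprit; if the reads support the assembly base, the difference seen in the mutant is more likely a real mutation.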


Thank you, GenoMax! Could you please suggest the best tools for aligning the Illumina reads to the newly assembled genome? Thanks again!


I think bwa-mem2 or bowtie2 should do the trick :)
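
For reference, here is a rough Python sketch of what a bwa-mem2 run followed by samtools sorting and indexing might look like. It assumes paired-end reads and that `bwa-mem2` and `samtools` are installed and on the PATH; the file names and the helper names `alignment_commands` and `run_all` are hypothetical. As written, the script only prints the commands; calling `run_all(commands)` would execute them.

```python
import subprocess


def alignment_commands(assembly, reads_1, reads_2,
                       prefix="illumina_vs_assembly", threads=4):
    """Build (argv, stdout_path) pairs; stdout_path is None when nothing is redirected."""
    sam = f"{prefix}.sam"
    bam = f"{prefix}.sorted.bam"
    return [
        # Index the new assembly so reads can be aligned against it.
        (["bwa-mem2", "index", assembly], None),
        # Align the paired-end reads; SAM output goes to stdout, captured in a file.
        (["bwa-mem2", "mem", "-t", str(threads), assembly, reads_1, reads_2], sam),
        # Sort by coordinate and write BAM, then index it for viewers and pileups.
        (["samtools", "sort", "-o", bam, sam], None),
        (["samtools", "index", bam], None),
    ]


def run_all(commands):
    """Run each command in order, stopping at the first failure."""
    for argv, stdout_path in commands:
        if stdout_path is None:
            subprocess.run(argv, check=True)
        else:
            with open(stdout_path, "w") as out:
                subprocess.run(argv, check=True, stdout=out)


# Hypothetical file names; replace with the real assembly and FASTQ paths.
commands = alignment_commands("assembly.fasta", "reads_R1.fastq.gz", "reads_R2.fastq.gz")
for argv, stdout_path in commands:
    print(" ".join(argv) + (f" > {stdout_path}" if stdout_path else ""))
```

The resulting sorted, indexed BAM can then be opened in a genome browser such as IGV to inspect the mismatch positions by eye.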


I will do that! Thanks a lot!
