Question

What is the general approach behind software for finding mutations?

0

3.5 years ago

Student ▴ 30

Hello !

I am looking for software that finds mutations and/or variants in bacterial genomes. Could you please suggest which tools are best, and explain how they generally work?

Before posting this question, I searched on Google and in this community, where I found the post Identifying Non Synonymous mutations in bacterial genomes, which mentions FreeBayes, for example. I saw that this software (like many others) is based on aligning reads from the query sequence to the reference sequence.

However, suppose I have the reference genome (its complete genome sequence and the CDS file with the gene sequences) and a variant genome (again the complete genome sequence and the CDS file, but no reads). I want to find where the mutations are, that is, which genes carry non-synonymous mutations and which carry synonymous ones. How can I do that? Or can this kind of analysis be done only with reads from the query sequence (by "query" I mean my variant genome)?

Hoping that it is not a very silly question, I thank you in advance.

software mutations bacteria genome archaea • 1.0k views

ADD COMMENT • link 3.4 years ago by Student ▴ 30

score 2 · Answer 1 · 2021-06-16

2

3.5 years ago

JC 13k

In your case, you need to align the two genomes and check all the mismatches to get the variants. In general, I would use Blast/ClustalO to align each chromosome, then extract the variants from the alignment with Python/Perl scripts and output them as a VCF. Then use VEP or a similar tool to annotate those variants.
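To make the annotation step concrete, here is a minimal Python sketch of what "synonymous vs. non-synonymous" means for one gene. It is not how VEP works internally; it only assumes you already have the reference and variant CDS of the same gene, in frame and of equal length (no indels). The function name `classify_codon_changes` and the tuple layout are made up for illustration. The table is the standard genetic code; as far as I know, NCBI translation table 11 (bacteria and archaea) uses the same codon assignments and differs only in the allowed start codons.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def classify_codon_changes(ref_cds, alt_cds):
    """Compare two in-frame, equal-length CDS strings codon by codon.

    Returns a list of tuples:
    (codon_number, ref_codon, alt_codon, ref_aa, alt_aa, kind)
    where kind is "synonymous", "non-synonymous" or "unknown".
    """
    if len(ref_cds) != len(alt_cds):
        raise ValueError("CDS lengths differ: align first and handle indels separately")
    if len(ref_cds) % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")

    changes = []
    for start in range(0, len(ref_cds), 3):
        ref_codon = ref_cds[start:start + 3].upper()
        alt_codon = alt_cds[start:start + 3].upper()
        if ref_codon == alt_codon:
            continue
        # Codons with ambiguous bases (e.g. N) are not in the table.
        ref_aa = CODON_TABLE.get(ref_codon, "X")
        alt_aa = CODON_TABLE.get(alt_codon, "X")
        if "X" in (ref_aa, alt_aa):
            kind = "unknown"
        elif ref_aa == alt_aa:
            kind = "synonymous"
        else:
            kind = "non-synonymous"
        changes.append((start // 3 + 1, ref_codon, alt_codon, ref_aa, alt_aa, kind))
    return changes


if __name__ == "__main__":
    # Toy example: codon 2 changes GCT -> GCC, codon 3 changes AAA -> AGA.
    for change in classify_codon_changes("ATGGCTAAA", "ATGGCCAGA"):
        print(change)
```

For real data, genes with insertions or deletions need an alignment first, and a dedicated annotator is the safer choice; the sketch is only meant to show the logic.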

0

Hi JC, thank you for your suggestion. I thought I had to do something like an alignment between the two sequences of interest, and now you have made the concept clear to me. However, this is not quite enough detail for me to put it into practice :/

For example, on the first point, "Blast/ClustalO to align each chromosome": do you mean that I should align each sequence of the CDS file using BLAST, and then download the txt file of the alignment? I did this here on NCBI (for 4 sequences), and I can get a txt file of the alignment by clicking Download All. But doing this for each sequence of the CDS file, or even for 4 sequences at a time, would take too much time. If I paste the whole CDS file, it says: "Your total query length is greater than allowed on the BLAST webserver. You can either reduce the size to 1,000,000 or less and try again or run stand-alone BLAST or our BLAST cloud option." Should I install something like BLAST+?
Do you know of any Python scripts on the web that can do the conversion, that is, "extract from the alignment the variants" and "output them as a VCF"?
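As a rough idea of what such a conversion script does, here is a minimal Python sketch. It assumes the alignment has already been parsed into two gapped strings of equal length (reference and variant, with "-" for gaps), which is not the raw BLAST txt output. The function name `alignment_to_vcf` and the default chromosome name "chr1" are made up for illustration. Indels are written with the preceding reference base as anchor, and an indel at the very start of the alignment is skipped because it has no anchor.

```python
VCF_HEADER = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
]


def _record(chrom, pos, ref, alt):
    # ID, QUAL, FILTER and INFO are left as missing values (".").
    return f"{chrom}\t{pos}\t.\t{ref}\t{alt}\t.\t.\t."


def alignment_to_vcf(ref_aln, alt_aln, chrom="chr1"):
    """Turn a gapped pairwise alignment into VCF body lines (no header)."""
    if len(ref_aln) != len(alt_aln):
        raise ValueError("aligned sequences must have the same length")
    ref_aln, alt_aln = ref_aln.upper(), alt_aln.upper()

    lines = []
    ref_pos = 0   # 1-based position of the last reference base consumed
    anchor = ""   # that reference base, used to anchor indels
    i, n = 0, len(ref_aln)
    while i < n:
        r, a = ref_aln[i], alt_aln[i]
        if r != "-" and a != "-":
            ref_pos += 1
            anchor = r
            if r != a:
                lines.append(_record(chrom, ref_pos, r, a))  # substitution
            i += 1
            continue
        # Collect a whole run of gapped columns as one indel event.
        ref_seg, alt_seg = [], []
        while i < n and (ref_aln[i] == "-" or alt_aln[i] == "-"):
            if ref_aln[i] != "-":
                ref_seg.append(ref_aln[i])
            if alt_aln[i] != "-":
                alt_seg.append(alt_aln[i])
            i += 1
        if anchor and (ref_seg or alt_seg):
            lines.append(_record(chrom, ref_pos,
                                 anchor + "".join(ref_seg),
                                 anchor + "".join(alt_seg)))
        ref_pos += len(ref_seg)
        if ref_seg:
            anchor = ref_seg[-1]
    return lines


if __name__ == "__main__":
    body = alignment_to_vcf("ACGTTA--C", "ACCT-AGGC")
    print("\n".join(VCF_HEADER + body))
```

Positions are counted on the ungapped reference, so the output can be compared against the reference CDS or genome coordinates. A real pipeline would also need to parse the aligner's output format and deal with reverse-strand hits and rearrangements, which this sketch ignores.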

EDIT: I realized that, by clicking on the link, you cannot see what I did on NCBI BLAST. Anyway, just so you understand, the alignment file that I mentioned begins with this "framework" here

ADD REPLY • link 3.4 years ago by Student ▴ 30