Replace ambiguous characters in fasta MSA
15 months ago
Alexandre • 0

Hi everyone,

I have an MSA that I feed into a program that cannot handle Ns, and many of the sequences in my MSA (~20%) contain at least a couple of them.

I am looking for a program that can compute the most likely state for each N in my alignment, but I am not sure what to look for; maybe some phylogenetic software has that ability. One important thing is that I want to keep the gaps in my alignment, since they are important for the rest of the analyses.

I hope I was able to make myself clear.

Thank you, Alex

maximum-likelihood DNA • 502 views

I would expect you could do this by building an HMM from the alignment and using hmmemit (it would certainly work for protein; I have never tried it with nucleic acid).

The main question is how big the MSA is and how many sites carry enough information to yield a prediction (if a column is all N, you cannot conjure up a guess at what should be there).
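
To get a feel for how many Ns are actually recoverable, here is a minimal Python sketch. Note that it is not the HMM approach: it swaps in a much cruder rule, filling each N with the most common A/C/G/T in the same alignment column. The function names `read_fasta` and `fill_ns` are made up for illustration, and it assumes a DNA alignment in FASTA format where every sequence has the same length. Gaps are never touched, and an N in a column with no unambiguous base is left as N and counted as unresolved.

```python
from collections import Counter


def read_fasta(text):
    """Parse FASTA text into a list of (header, sequence) pairs."""
    records, header, chunks = [], None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def fill_ns(records):
    """Replace each N with the most common A/C/G/T in its column.

    Gaps and every other character are copied unchanged. Columns that
    contain no A/C/G/T at all keep their Ns. Ties are broken by whichever
    base appears first in the column (an arbitrary choice).
    Returns (filled_records, number_of_unresolved_Ns).
    """
    seqs = [seq for _, seq in records]
    if len({len(s) for s in seqs}) > 1:
        raise ValueError("sequences in an alignment must have equal length")

    # Majority base per column, or None if the column is uninformative.
    consensus = []
    for column in zip(*seqs):
        counts = Counter(b.upper() for b in column if b.upper() in "ACGT")
        consensus.append(counts.most_common(1)[0][0] if counts else None)

    filled, unresolved = [], 0
    for header, seq in records:
        new_seq = []
        for base, cons in zip(seq, consensus):
            if base in "Nn" and cons is not None:
                new_seq.append(cons)
            else:
                if base in "Nn":
                    unresolved += 1
                new_seq.append(base)
        filled.append((header, "".join(new_seq)))
    return filled, unresolved


if __name__ == "__main__":
    msa = ">a\nACGT-N\n>b\nACNT-N\n>c\nANGTAN\n"
    filled, unresolved = fill_ns(read_fasta(msa))
    for header, seq in filled:
        print(f">{header}\n{seq}")
    print("unresolved Ns:", unresolved)
```

A column majority ignores the phylogeny entirely, so treat the output as a rough baseline; if the unresolved count is high, no method will have much to work with at those sites.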
