6.7 years ago
khhgng
I have a FASTA file with sequences of one gene from 1000 Arabidopsis accessions.
What would be the best way to cluster and visualise these sequences based on the SNPs among them?
Many thanks
Run a multiple sequence alignment with a tool such as T-Coffee, MAFFT, or MUSCLE.
Thanks. In that case, given the very high sequence similarity, which is the better approach for building a tree: neighbour joining, parsimony, or ...?
Depends on what you want to do. If you only want to visualize the SNPs then you don't need to make a tree.
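If visualising the SNPs is the goal, one option is to reduce the alignment to its variable columns. Here is a minimal sketch in plain Python, assuming the sequences are already aligned to equal length. The function names `read_fasta` and `variable_columns` and the toy alignment are made up for illustration, and gaps and ambiguous bases are ignored when deciding whether a column is polymorphic:

```python
import io

def read_fasta(handle):
    """Parse FASTA text from a file-like object into {name: sequence}."""
    records, name = {}, None
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            records[name] = []
        elif name is not None:
            records[name].append(line.upper())
    return {n: "".join(parts) for n, parts in records.items()}

def variable_columns(aligned):
    """Return 0-based alignment columns holding more than one of A, C, G, T."""
    seqs = list(aligned.values())
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same length")
    cols = []
    for i in range(length):
        bases = {s[i] for s in seqs} & set("ACGT")
        if len(bases) > 1:
            cols.append(i)
    return cols

# Toy alignment standing in for the real MSA output (hypothetical accessions).
toy = io.StringIO(">acc1\nATGCA\n>acc2\nATGTA\n>acc3\nACGCA\n")
aligned = read_fasta(toy)
snp_cols = variable_columns(aligned)

# Print each accession reduced to its SNP positions only.
for name, seq in aligned.items():
    print(name, "".join(seq[i] for i in snp_cols))
```

For a real file you would pass `open("aligned.fasta")` (a placeholder filename) instead of the `StringIO` object.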
I would rather cluster them based on SNPs. That may make more sense of the evolution at this one locus.
And that may automatically happen. Run the MSA and then you could edit it (within reason) to show what you want to demonstrate.
If your alignment is good/close it probably won't make a difference what method you use.
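With sequences this similar, many accessions will likely carry identical sequences, so a simple first clustering is to collapse them into haplotypes and count the SNP differences between those. This is a rough sketch, not a substitute for a proper tree-building tool. It assumes aligned sequences of equal length; `group_haplotypes` and `hamming` are illustrative names, and the toy data are invented:

```python
from collections import defaultdict
from itertools import combinations

def group_haplotypes(aligned):
    """Group accession names that share an identical aligned sequence."""
    groups = defaultdict(list)
    for name, seq in aligned.items():
        groups[seq].append(name)
    return dict(groups)

def hamming(a, b):
    """Count mismatches, skipping positions where either base is not A, C, G or T."""
    valid = set("ACGT")
    return sum(1 for x, y in zip(a, b) if x != y and x in valid and y in valid)

# Toy aligned sequences (hypothetical accessions).
aligned = {
    "acc1": "ATGCA",
    "acc2": "ATGTA",
    "acc3": "ATGCA",
    "acc4": "ACGTA",
}

haplotypes = group_haplotypes(aligned)
for seq, names in haplotypes.items():
    print(seq, names)

# Pairwise SNP differences between distinct haplotypes.
distances = {(a, b): hamming(a, b) for a, b in combinations(haplotypes, 2)}
for (a, b), d in distances.items():
    print(a, b, d)
```

The resulting distance table could then be passed to whichever tree or clustering method you prefer.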