How can I pick out prokaryotic sequences from a multiple sequence alignment file?
I have a large MSA file in the usual Clustal format, containing over 1,000 sequences. I want to select only the prokaryotic sequences from it and write them to a new MSA file. How should I do this?
How are the sequences named: by TXID, by organism name, or something else? The first step would be to get a list of the names of the desired sequences.
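Once you have that list, the filtering itself is simple. Here is a minimal sketch in plain Python, assuming each alignment row starts with the sequence name followed by whitespace, as in standard Clustal output. The function name `filter_clustal` and the sequence names in the example are made up for illustration; deciding which names are prokaryotic (for example, via a taxonomy lookup) is a separate step not shown here.

```python
def filter_clustal(lines, wanted):
    """Keep only alignment rows whose sequence name is in `wanted`."""
    kept = []
    for i, line in enumerate(lines):
        if i == 0 or not line.strip():
            # The "CLUSTAL ..." header and blank block separators pass through.
            kept.append(line)
        elif line[0].isspace():
            # Conservation row (*, :, .) is not valid for the subset, so drop it.
            continue
        elif line.split()[0] in wanted:
            kept.append(line)
    return kept


# Hypothetical example data; real input would come from open("your.aln").
example = [
    "CLUSTAL W (1.83) multiple sequence alignment\n",
    "\n",
    "ecoli_seq    MKT-AYIAKQ\n",
    "human_seq    MKTWAYLAKQ\n",
    "bsub_seq     MRT-AYIGKQ\n",
    "             *:* **:.**\n",
    "\n",
]
wanted = {"ecoli_seq", "bsub_seq"}  # assumed list of prokaryotic sequence names
result = filter_clustal(example, wanted)
print("".join(result), end="")
```

Note that this keeps the original alignment columns, so the subset may contain columns that are gaps in every remaining sequence. If that matters, you may want to strip those columns or realign the selected sequences.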