How can I pick out prokaryotic sequences from a multiple sequence alignment file?
10.0 years ago
Chong Tang ▴ 100

I have a large MSA file in standard Clustal format containing over 1,000 sequences. How can I select the prokaryotic sequences from this file and write them to a new MSA file?

MSA multiple sequence alignment • 2.4k views

What are the names of the sequences? TXIDs? Organism name? The first step would be to get a list of the desired sequences' names.
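
Once you have that list of names, pulling the matching rows out of the alignment is straightforward. Here is a minimal sketch in plain Python, assuming a standard Clustal layout (a header line starting with "CLUSTAL", then blocks of "name sequence" lines, with conservation lines that begin with whitespace). The function names and the sequence names in the example are hypothetical, chosen only for illustration.

```python
def read_clustal(lines):
    """Parse Clustal-format lines into {name: aligned_sequence}, keeping file order."""
    seqs = {}
    for line in lines:
        # Skip blank lines, the header, and conservation lines (leading whitespace).
        if not line.strip() or line.startswith("CLUSTAL") or line[0].isspace():
            continue
        fields = line.split()
        name, chunk = fields[0], fields[1]
        seqs[name] = seqs.get(name, "") + chunk
    return seqs


def filter_alignment(seqs, wanted):
    """Keep only the sequences whose names are in `wanted`."""
    return {name: seq for name, seq in seqs.items() if name in wanted}


def write_clustal(seqs, width=60):
    """Format {name: aligned_sequence} as Clustal-style text (no conservation line)."""
    lines = ["CLUSTAL W filtered alignment", ""]
    if seqs:
        pad = max(len(name) for name in seqs) + 6
        length = len(next(iter(seqs.values())))
        for start in range(0, length, width):
            for name, seq in seqs.items():
                lines.append(name.ljust(pad) + seq[start:start + width])
            lines.append("")
    return "\n".join(lines) + "\n"


# Hypothetical three-sequence alignment, for illustration only.
example = """CLUSTAL W (1.83) multiple sequence alignment

seqA_bacterium      MAIDENKQ
seqB_eukaryote      MA-DENRQ
seqC_archaeon       MAIDE-KQ
                    ** **  *

seqA_bacterium      ALAAAL
seqB_eukaryote      ALSAAL
seqC_archaeon       ALAAAL
                    ** ***
"""

wanted = {"seqA_bacterium", "seqC_archaeon"}  # names of the prokaryotic sequences
kept = filter_alignment(read_clustal(example.splitlines()), wanted)
filtered_text = write_clustal(kept)
```

Note that the kept rows are copied as they are, so the new file may contain columns that are gaps in every remaining sequence; realign or strip those columns if that matters for your analysis.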

9.9 years ago
Brett ▴ 150

An unsophisticated way of doing it would be to copy and paste all the accession numbers into the PIR database: http://pir.georgetown.edu/pirwww/index.shtml
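
With over 1,000 sequences, collecting the accession numbers by hand is tedious, so it may help to pull them out of the alignment file with a short script. This is only a sketch: it assumes that each sequence line starts with the sequence name, and that names written as "db|ACCESSION|ENTRY" carry the accession in the second field (an assumption about your naming, not something the file format guarantees). The function name and example names are hypothetical.

```python
def sequence_ids(clustal_lines):
    """Return the unique sequence identifiers in a Clustal file, in file order."""
    ids = []
    seen = set()
    for line in clustal_lines:
        # Skip blank lines, the header, and conservation lines (leading whitespace).
        if not line.strip() or line.startswith("CLUSTAL") or line[0].isspace():
            continue
        name = line.split()[0]
        parts = name.split("|")
        # Assumption: "db|ACCESSION|ENTRY" style names hold the accession in field 2.
        accession = parts[1] if len(parts) >= 3 else name
        if accession not in seen:
            seen.add(accession)
            ids.append(accession)
    return ids


# Hypothetical alignment with two naming styles, for illustration only.
example = """CLUSTAL W (1.83) multiple sequence alignment

db|ACC0001|ENTRY_A      MAIDENKQ
plain_name              MA-DENRQ

db|ACC0001|ENTRY_A      ALAAAL
plain_name              ALSAAL
"""

ids = sequence_ids(example.splitlines())
print("\n".join(ids))  # one identifier per line, ready to paste
```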

This will bring up all the details for all the sequences.

From the drop-down filter box, select TAXA ID. In this field, enter 2 (the NCBI taxonomy ID for Bacteria; Archaea, the other prokaryotic domain, has ID 2157) and the results will be filtered. Then click "select all" and use the alignment button (which opens Jalview) at the top right of the results.
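
If you would rather apply the same taxonomy filter locally, the idea can be sketched as below. It assumes you have already obtained, for each sequence, the list of NCBI taxonomy IDs in its lineage; how you build that table is up to you. The `lineages` table here is a hypothetical, abbreviated example, and the function name is illustrative.

```python
# NCBI Taxonomy IDs of the two prokaryotic domains: Bacteria (2) and Archaea (2157).
PROKARYOTE_ROOTS = {2, 2157}


def is_prokaryote(lineage):
    """True if the lineage (an iterable of taxonomy IDs) passes through Bacteria or Archaea."""
    return not PROKARYOTE_ROOTS.isdisjoint(lineage)


# Hypothetical, abbreviated lineages keyed by sequence name (illustration only).
lineages = {
    "seqA": [131567, 2, 562],      # cellular organisms > Bacteria > Escherichia coli
    "seqB": [131567, 2759, 9606],  # cellular organisms > Eukaryota > Homo sapiens
    "seqC": [131567, 2157],        # cellular organisms > Archaea
}

prokaryotic_ids = [name for name, lineage in lineages.items() if is_prokaryote(lineage)]
print(prokaryotic_ids)
```

The resulting list of names can then be used to select the matching rows from the alignment.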

Depending on how the sequences in your alignment are named, DAVID might be handy for converting the initial identifiers into a format PIR can accept.
