Dear all,
I have been working on several MSA files (3,000 ortholog MSA files in PHYLIP format, covering 7 genomes) from which I need to extract the highly conserved sequence on the basis of certain parameters. I want to write a Perl script that covers all my needs. I know Gblocks (http://molevol.cmima.csic.es/castresana/Gblocks_server.html) generates filtered alignments, but it is not very flexible and doesn't fit what I want. Is there a robust way to read an MSA into a Perl data structure, preferably a hash, and process the aligned stretches? Currently I am using Perl arrays, treating the alignment as a 7 × l matrix and processing each aligned position with nested for loops.
Best regards, Rahul
You can do that in Perl, either using an array of arrays or a hash of arrays. There are many possible ways.
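As a rough sketch of the hash-of-arrays option: the code below assumes relaxed PHYLIP (name, whitespace, then sequence on each line, with optional interleaved continuation blocks) and would need adjusting for strict 10-character names or multi-line sequential files. The sub names read_phylip and conserved_columns, the gap character '-', and the "most common residue" threshold are illustrative choices of mine, not anything fixed by the format or by your criteria.

```perl
use strict;
use warnings;

# Parse a relaxed PHYLIP alignment from a filehandle into a hash of arrays:
# $msa->{name} = [ 'A', 'C', '-', ... ]. Interleaved continuation lines
# (which carry no names) are appended to the sequences in their original order.
sub read_phylip {
    my ($fh) = @_;
    my $header = <$fh> // '';
    my ($ntax, $nsites) = $header =~ /^\s*(\d+)\s+(\d+)/
        or die "Bad PHYLIP header\n";
    my (@names, %seq);
    my $i = 0;
    while (my $line = <$fh>) {
        next unless $line =~ /\S/;          # skip blank lines between blocks
        $line =~ s/^\s+|\s+$//g;
        if (@names < $ntax) {               # first block: name, then sequence
            my ($name, $rest) = split /\s+/, $line, 2;
            push @names, $name;
            $seq{$name} = '';
            $line = defined $rest ? $rest : '';
        }
        $line =~ s/\s+//g;                  # drop spacing inside the sequence
        $seq{ $names[ $i++ % $ntax ] } .= $line;
    }
    my %msa;
    for my $name (@names) {
        my $len = length($seq{$name});
        $len == $nsites or die "$name: expected $nsites sites, got $len\n";
        $msa{$name} = [ split //, uc $seq{$name} ];
    }
    return (\%msa, \@names, $nsites);
}

# Return the 0-based indices of gap-free columns in which the most common
# residue is present in at least $min_frac of the sequences.
sub conserved_columns {
    my ($msa, $names, $nsites, $min_frac) = @_;
    my @keep;
    for my $col (0 .. $nsites - 1) {
        my %count;
        $count{ $msa->{$_}[$col] }++ for @$names;
        next if $count{'-'};
        my ($top) = sort { $b <=> $a } values %count;
        push @keep, $col if $top / scalar(@$names) >= $min_frac;
    }
    return @keep;
}

# Small in-memory example; for real files use: open my $fh, '<', $path
my $phylip = <<'END';
3 6
seqA ACGTAC
seqB ACGTTC
seqC AC-TAC
END
open my $fh, '<', \$phylip or die "Cannot open in-memory file: $!";
my ($msa, $names, $nsites) = read_phylip($fh);
my @cols = conserved_columns($msa, $names, $nsites, 1.0);
print "Fully conserved columns: @cols\n";
```

On the toy alignment this prints columns 0 1 3 5. The @names array is kept alongside the hash because Perl hashes do not preserve insertion order, so it is what lets you iterate the 7 genomes in file order.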
I think trimAl may be of interest to you. It's a really versatile tool, and you can filter by % gaps, % conservation, etc.: http://trimal.cgenomics.org
Many thanks for your kind reply! I did that with Perl arrays.
Regards, Rahul