5.4 years ago
heiman_zhang
I have reference genomes for three different subspecies, none of which are available in UCSC. They are very similar to each other. I have one FASTA file per genome, and I want to find all the sequences conserved across the three genomes. How can I do that? What information and tools do I need?
You can approach this from two angles: orthogroup finding or multiple whole-genome alignment. For multiple whole-genome alignment, you could use something like Mauve or the UCSC multiz pipeline. For orthologs, you could use something like OrthoFinder (https://github.com/davidemms/OrthoFinder), which would itself require gene prediction in each species separately, maybe with something like MAKER.
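If you go the whole-genome alignment route, the last step is pulling conserved stretches out of each aligned block. Here is a minimal Python sketch of that step only. It assumes you have already run the aligner and parsed one block into equal-length, gapped strings, one per genome. The function name `conserved_blocks` and the `min_len` threshold are made up for illustration and are not part of Mauve, multiz, or any other tool.

```python
def conserved_blocks(aligned, min_len=3):
    """Return (start, end, sequence) for runs of identical, gap-free columns.

    `aligned` is a list of equal-length strings from one alignment block.
    Coordinates are 0-based, end-exclusive, and refer to alignment columns,
    not to positions in any individual genome.
    """
    if not aligned:
        return []
    length = len(aligned[0])
    if any(len(s) != length for s in aligned):
        raise ValueError("aligned sequences must have equal length")

    blocks = []
    start = None  # column where the current conserved run began
    for i in range(length + 1):
        if i < length:
            column = {s[i].upper() for s in aligned}
            # conserved = one distinct base, and it is not a gap or an N
            same = len(column) == 1 and column.isdisjoint({"-", "N"})
        else:
            same = False  # sentinel column to close a run at the end
        if same and start is None:
            start = i
        elif not same and start is not None:
            if i - start >= min_len:
                blocks.append((start, i, aligned[0][start:i].upper()))
            start = None
    return blocks


# Toy alignment of three genomes (hypothetical data)
toy = ["ACGTACGT-A",
       "ACGTTCGTAA",
       "ACGTACGTAA"]
print(conserved_blocks(toy, min_len=3))
```

This only reports perfectly identical columns. For real genomes you would probably want to tolerate some mismatches, or use a dedicated conservation scoring tool on the alignment instead.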