2.6 years ago
prainbowltd
Hi,
I have multiple FASTA files, each representing a section of a different chromosome of a eukaryote. I also have the gene annotation (GFF3 format) downloaded from the NCBI data viewer.
My aim is to find the genes/regions that are common across the multiple FASTA files. The task would also involve finding repeats, orthologues, paralogues, etc.
Any suggestions for bioinformatics tools for this task?
Thanks for reading the post.
That is many questions rolled into one, though. Your "task" will also depend on many things, such as the completeness of the genome, the species, the annotation, etc. However, here are some suggestions:
Repeatfinder: finds repetitive sequences in complete and draft genomes.
MUMmer: a system for rapidly aligning entire genomes, whether in complete or draft form.
Orthofinder / OrthoMCL: OrthoMCL is a genome-scale algorithm for grouping orthologous protein sequences.
MCL
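As a quick first pass before running any of the tools above, the GFF3 annotation itself can be checked for gene names that occur on every sequence. The following is only a rough sketch using the Python standard library, not part of any of the listed tools. It assumes the gene records carry a `Name` attribute in column 9 and that a shared name is a usable hint; it does not establish orthology or paralogy, which still needs sequence-based clustering. The function names and the toy records are made up for illustration.

```python
from collections import defaultdict
from urllib.parse import unquote


def parse_attributes(field):
    """Turn a GFF3 column-9 string such as 'ID=g1;Name=abc1' into a dict."""
    attrs = {}
    for pair in field.strip().split(";"):
        if "=" in pair:
            key, value = pair.split("=", 1)
            # GFF3 percent-encodes reserved characters in attribute values.
            attrs[key.strip()] = unquote(value.strip())
    return attrs


def gene_names_by_seqid(gff3_lines, feature_type="gene", key="Name"):
    """Collect the set of gene names seen on each sequence (column 1)."""
    names = defaultdict(set)
    for line in gff3_lines:
        # Skip blank lines, directives and comments.
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        # Feature lines have exactly 9 tab-separated columns.
        if len(cols) != 9 or cols[2] != feature_type:
            continue
        attrs = parse_attributes(cols[8])
        if key in attrs:
            names[cols[0]].add(attrs[key])
    return dict(names)


def shared_names(names_by_seqid):
    """Return the names present on every sequence."""
    sets = list(names_by_seqid.values())
    return set.intersection(*sets) if sets else set()


# Toy records (hypothetical, not real annotation data).
example = [
    "##gff-version 3",
    "chr1\tRefSeq\tgene\t100\t900\t.\t+\t.\tID=gene-1;Name=abc1",
    "chr1\tRefSeq\tgene\t1500\t2400\t.\t-\t.\tID=gene-2;Name=xyz2",
    "chr2\tRefSeq\tgene\t300\t1200\t.\t+\t.\tID=gene-3;Name=abc1",
    "chr2\tRefSeq\tmRNA\t300\t1200\t.\t+\t.\tID=rna-3;Name=xyz2",
    "chr2\tRefSeq\tgene\t5000\t6100\t.\t+\t.\tID=gene-4;Name=qrs3",
]

print(sorted(shared_names(gene_names_by_seqid(example))))  # ['abc1']
```

To run it on a real file, pass an open file handle instead of the list, e.g. `gene_names_by_seqid(open("annotation.gff3"))`, where the file name is a placeholder. Anything this turns up should be treated as a candidate list to confirm with alignment or orthology clustering.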