Dear all, I am new to DNA-Seq. I have my sequencing data, but the available reference (Aspergillus flavus) is not very reliable. I want to perform genome annotation. What is the most trustworthy pipeline (software)? Please help me.
We use the Softberry gene-finding programs: http://www.softberry.com/berry.phtml?topic=index&group=programs&subgroup=gfind We have also used MAKER (http://www.yandell-lab.org/software/maker.html) and Augustus (http://bioinf.uni-greifswald.de/augustus/).
You can use BUSCO (http://busco.ezlab.org) to assess the completeness of the assembly and annotation.
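As a rough illustration of how a BUSCO result could be read programmatically, here is a minimal Python sketch that parses the one-line score summary BUSCO reports. The exact line format is an assumption based on typical BUSCO output, the function name `parse_busco_summary` is hypothetical, and the numbers in the example string are invented for demonstration only.

```python
import re

# Assumed format of the one-line BUSCO summary, for example:
#   C:95.0%[S:93.5%,D:1.5%],F:2.0%,M:3.0%,n:758
# C = complete, S = single-copy, D = duplicated,
# F = fragmented, M = missing, n = number of BUSCO groups searched.
SUMMARY_RE = re.compile(
    r"C:(?P<complete>[\d.]+)%"
    r"\[S:(?P<single>[\d.]+)%,D:(?P<duplicated>[\d.]+)%\],"
    r"F:(?P<fragmented>[\d.]+)%,M:(?P<missing>[\d.]+)%,n:(?P<total>\d+)"
)


def parse_busco_summary(text):
    """Return the BUSCO percentages and group count found in `text`."""
    match = SUMMARY_RE.search(text)
    if match is None:
        raise ValueError("no BUSCO summary line found")
    fields = match.groupdict()
    total = int(fields.pop("total"))
    scores = {name: float(value) for name, value in fields.items()}
    scores["total"] = total
    return scores


# Invented example values, not results from a real run.
example = "C:95.0%[S:93.5%,D:1.5%],F:2.0%,M:3.0%,n:758"
scores = parse_busco_summary(example)
print(scores)
```

A high "complete" percentage with few "missing" groups suggests that most expected conserved genes were recovered; a high "duplicated" fraction can point to assembly redundancy.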
Also, it is not so much a question of which tool you select; what matters most is how you use it, how you train its parameters, and how you post-process the results.
Yes, the software parameters. Genome annotation software uses different kinds of data, such as the nucleotide composition of different features in related species, and it can use additional information from your own organism, such as transcriptome data. These datasets are used to adjust the parameters of the statistical models that predict different aspects of the genes.
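To make the idea of "training parameters" concrete, here is a toy Python sketch. It estimates codon frequencies from a few known coding sequences and then scores a candidate sequence by its log-likelihood ratio against a uniform background. This is only an illustration of the principle: real gene finders such as Augustus or GeneMark use much richer statistical models, and the function names and the tiny training sequences below are made up for the example.

```python
import math
from collections import Counter
from itertools import product

# All 64 possible codons over the DNA alphabet.
ALL_CODONS = ["".join(p) for p in product("ACGT", repeat=3)]


def codons(seq):
    """Split a sequence into consecutive in-frame codons (trailing bases dropped)."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def train_codon_model(coding_seqs, pseudocount=1.0):
    """Estimate codon frequencies from training sequences.

    The pseudocount keeps unseen codons from getting probability zero.
    """
    counts = Counter()
    for seq in coding_seqs:
        counts.update(c for c in codons(seq) if c in ALL_CODONS)
    total = sum(counts[c] for c in ALL_CODONS) + pseudocount * len(ALL_CODONS)
    return {c: (counts[c] + pseudocount) / total for c in ALL_CODONS}


def coding_score(seq, model):
    """Log-likelihood ratio of `seq` under the trained model vs. uniform codon usage."""
    background = 1.0 / len(ALL_CODONS)
    return sum(math.log(model[c] / background) for c in codons(seq) if c in model)


# Hypothetical "known genes" from a related species (toy data).
training = ["ATGGCTGCTGCTAAG", "ATGGCTAAGGCTGCT"]
model = train_codon_model(training)

# A candidate resembling the training data scores higher than one that does not.
print(coding_score("GCTGCTAAG", model))
print(coding_score("TTTCCCGGG", model))
```

Changing the training set changes the frequencies and therefore the scores, which is why parameters trained on a closely related species, or on your own transcript evidence, tend to give better predictions.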
If you have supporting transcript data, then GeneMark is a good option. Otherwise, as Petr suggested, Augustus and MAKER would be apt. The BRAKER pipeline is well designed for cases where no closely related species has been annotated, since Augustus is trained on your own data.
If you are working with a model organism that has other well-annotated subspecies, you can give GeMoMa a try. Their recent v1.4 looks promising.
https://www.ncbi.nlm.nih.gov/genome/annotation_euk/process/
Thank you for your answer. I have a question: is it possible to do this manually?
No problem. I don't have any experience in genome annotation so hopefully someone else can chime in.