Question

What is required for genome annotation?

7.0 years ago

jjeangoh • 0

Hi! I have an undergraduate degree in biotechnology (wet lab) and am currently doing a master's in bioinformatics. My project is on fungal genome assembly and annotation. I have finished the genome assembly and am now struggling with the annotation because I have no prior experience with genomic data. I became interested in becoming a bioinformatician during the 2nd year of my degree and was fed up with wet-lab work, so I took this project. My current supervisor also has no hands-on bioinformatics experience. His fallback was to outsource the annotation if no master's student took the project. My external co-supervisor can barely help me, as he has many bioinformatics projects of his own to deal with.

Anyway, does genome annotation require genome assembly data, transcriptomic data and protein data? I only have my genome assembly. The fungal genome I assembled is new and is not in the NCBI database. How can I get transcriptomic and protein data? Which database should I use, and how do I BLAST against it?

Sorry if the question sounds simple, but I can't work it out or even find it on Google! This project has stressed me out and I feel like giving up on a bioinformatics career. Are there any websites/resources that can help me on this path? Thanks for your help!

annotation genome • 1.6k views

ADD COMMENT • link 7.0 years ago by jjeangoh • 0

So you have already done the assembly. Let's assume it was all done correctly.

You can use MAKER2 (MAKER): https://bmcbioinformatics.biomedcentral.com/articles/10.1186/1471-2105-12-491

NCBI also has some tools for it: https://www.ncbi.nlm.nih.gov/genome/annotation_euk/process/

You can also build your own pipeline. I believe you need to start with gene prediction: https://en.wikipedia.org/wiki/List_of_gene_prediction_software. After that you can BLAST the predicted genes.
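To make the "predict, then BLAST" step a bit more concrete, here is a minimal Python sketch. It assumes NCBI BLAST+ is installed and that you have downloaded the protein set of a close relative. The file names (`relative_proteins.faa`, `predicted.faa`), the database name and the e-value cutoff are placeholders, not recommendations. The helper functions only build the command lines and pick the best hit per predicted gene from tabular output, so treat it as a starting point rather than a finished pipeline.

```python
def blast_commands(reference_proteins, predicted_proteins,
                   db_name="ref_prot", evalue="1e-5",
                   out_table="predicted_vs_ref.tsv"):
    """Return two BLAST+ command lines as argument lists (nothing is run here).

    All file names and the e-value cutoff are placeholder assumptions.
    """
    # Build a protein database from the close relative's proteins.
    make_db = ["makeblastdb", "-in", reference_proteins,
               "-dbtype", "prot", "-out", db_name]
    # Search the predicted proteins against it, tabular output (format 6).
    search = ["blastp", "-query", predicted_proteins, "-db", db_name,
              "-evalue", evalue, "-outfmt", "6", "-out", out_table]
    return make_db, search


def best_hits(tabular_lines):
    """Keep the highest-bitscore hit per query from BLAST tabular lines.

    Assumes the default 12 columns of -outfmt 6, where column 1 is the
    query ID, column 2 the subject ID and column 12 the bit score.
    """
    best = {}
    for line in tabular_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        query, subject, bitscore = fields[0], fields[1], float(fields[11])
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return best
```

You could pass each list to `subprocess.run(cmd, check=True)` and then feed the lines of the result table to `best_hits` to get one candidate function per predicted gene.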

ADD REPLY • link 7.0 years ago by gb ★ 2.2k

Hi there,

Could you please provide more information about the data?

What stage is the genome assembly at, e.g. scaffolds or draft genome?

Is it a new species or a new strain of fungi?

Is there an annotated genome of a closely related species/strain available?

Although I am not an expert in fungal bioinformatics, a Google search returned the following links that might be of interest to you.

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3207268/

https://github.com/CompSynBioLab-KoreaUniv/FunGAP

fungal genome annotation

ADD REPLY • link 7.0 years ago by Sej Modha 5.3k

What stage is the genome assembly? - The assembly produced only contigs.

Is it a new species or a new strain of fungi? - A new species, with only a few strain genomes in the database.

Is there an annotated genome of close species/strain available? - Yes.

My problem is how to find transcriptomic reads, as my own data is not transcriptomic.

ADD REPLY • link 7.0 years ago by jjeangoh • 0

Have a look at all the links. AUGUSTUS is an easy tool, and it also has a web version: http://bioinf.uni-greifswald.de/augustus/submission.php

If this is not a good solution: https://en.wikipedia.org/wiki/List_of_gene_prediction_software
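If you would rather run AUGUSTUS locally than through the web form, a rough Python sketch could look like the one below. It assumes AUGUSTUS is installed and that one of its pre-trained species models is close enough to your fungus. The species identifier and file names in the usage note are only examples (`augustus --species=help` lists the available models). AUGUSTUS writes its predictions to standard output, so the sketch redirects that to a file, and a small helper counts the predicted genes.

```python
import subprocess


def augustus_command(contigs_fasta, species):
    """Build an AUGUSTUS command line that emits GFF3 on standard output."""
    return ["augustus", f"--species={species}", "--gff3=on", contigs_fasta]


def run_augustus(contigs_fasta, species, out_gff):
    """Run AUGUSTUS and save its standard output to out_gff."""
    with open(out_gff, "w") as handle:
        subprocess.run(augustus_command(contigs_fasta, species),
                       stdout=handle, check=True)


def count_genes(gff_lines):
    """Count lines whose feature type (third column) is 'gene'."""
    total = 0
    for line in gff_lines:
        if line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 3 and fields[2] == "gene":
            total += 1
    return total
```

For example, `run_augustus("contigs.fasta", "aspergillus_nidulans", "augustus.gff3")` followed by `count_genes(open("augustus.gff3"))` gives a quick sanity check on how many genes were predicted. Both file names and the species model are hypothetical here, so pick the model closest to your organism.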

ADD REPLY • link 7.0 years ago by gb ★ 2.2k

You can use Funannotate. It is flexible about input requirements; the only thing it strictly needs is an assembly. Docs are here: http://funannotate.readthedocs.io/en/latest/. GitHub here: https://github.com/nextgenusfs/funannotate
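As a rough illustration of an assembly-only run, the Python sketch below builds the two command lines I would expect to need: soft-masking repeats, then gene prediction. The `-i`, `-o`, `-s` and `--cpus` options are written as I remember them from the Funannotate docs, so please confirm them with `funannotate mask --help` and `funannotate predict --help` for your installed version. The file names, output folder and species name are placeholders.

```python
def funannotate_commands(assembly, masked_assembly, out_dir, species, cpus=4):
    """Return (mask, predict) command lines as argument lists; nothing is run.

    Flags are assumed from the Funannotate documentation and should be
    checked against the --help output of the installed version.
    """
    # Soft-mask repeats in the assembly before gene prediction.
    mask = ["funannotate", "mask", "-i", assembly, "-o", masked_assembly,
            "--cpus", str(cpus)]
    # Predict genes on the masked assembly; species is a free-text name.
    predict = ["funannotate", "predict", "-i", masked_assembly,
               "-o", out_dir, "-s", species, "--cpus", str(cpus)]
    return mask, predict
```

Each list can be passed to `subprocess.run(cmd, check=True)`. If you later obtain RNA-seq or protein evidence, the docs describe how to supply it as additional input.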

ADD REPLY • link 7.0 years ago by Jon ▴ 360