Question

Genome annotation using transcriptome data

6.2 years ago

KG ▴ 10

Hi,

We have generated a new version of the genome assembly for a yeast species and have also sequenced its transcriptome. We would now like to annotate the genome. My questions are:

How can we use the transcriptome data for annotation?
Can you recommend a genome annotation pipeline that uses transcriptome data for functional validation of the annotated features?

Thank you for your time and help.

genome annotation transcriptome • 2.5k views

ADD COMMENT • link updated 3.5 years ago by sagnik ▴ 50 • written 6.2 years ago by KG ▴ 10

Are you looking for a validation of predicted transcripts? You could try a genome-guided transcript assembly, then map the transcripts back to the genome and compare the predicted to the assembled transcripts.
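To illustrate the comparison step, here is a minimal Python sketch, assuming both the predicted and the assembled transcripts are available as GTF with `exon` lines and a `transcript_id` attribute. The function names (`parse_transcripts`, `classify_predictions`), the label names, and the demo coordinates are made up for illustration. It calls a prediction supported when its intron chain matches an assembled transcript exactly, and falls back to a same-strand overlap check otherwise, which matters because yeast genes are often intronless. A dedicated tool such as gffcompare does this far more thoroughly; this only shows the idea.

```python
from collections import defaultdict


def parse_transcripts(gtf_text):
    """Return {transcript_id: (seqname, strand, start, end, introns)} from GTF exon lines."""
    exons = defaultdict(list)
    for line in gtf_text.splitlines():
        fields = line.split("\t")
        if line.startswith("#") or len(fields) < 9 or fields[2] != "exon":
            continue
        # Assumption: the attribute column contains transcript_id "some_id";
        tid = fields[8].split('transcript_id "')[1].split('"')[0]
        exons[tid].append((fields[0], fields[6], int(fields[3]), int(fields[4])))
    transcripts = {}
    for tid, recs in exons.items():
        recs.sort(key=lambda r: r[2])
        # An intron is the gap between consecutive exons (1-based, inclusive).
        introns = tuple((a[3] + 1, b[2] - 1) for a, b in zip(recs, recs[1:]))
        transcripts[tid] = (recs[0][0], recs[0][1], recs[0][2], recs[-1][3], introns)
    return transcripts


def classify_predictions(predicted_gtf, assembled_gtf):
    """Label each predicted transcript by its support in the assembled set."""
    assembled = list(parse_transcripts(assembled_gtf).values())
    chains = {(seq, strand, introns) for seq, strand, _, _, introns in assembled if introns}
    labels = {}
    for tid, (seq, strand, start, end, introns) in parse_transcripts(predicted_gtf).items():
        overlaps = any(
            a_seq == seq and a_strand == strand and a_start <= end and start <= a_end
            for a_seq, a_strand, a_start, a_end, _ in assembled
        )
        if introns and (seq, strand, introns) in chains:
            labels[tid] = "intron_chain_match"
        elif overlaps:
            labels[tid] = "overlap_only"
        else:
            labels[tid] = "unsupported"
    return labels


def exon(seq, start, end, strand, tid):
    """Hypothetical helper that builds one GTF exon line for the demo."""
    return "\t".join([seq, "demo", "exon", str(start), str(end), ".", strand, ".",
                      f'transcript_id "{tid}";'])


# Made-up demo data: three predictions, two assembled transcripts.
predicted = "\n".join([
    exon("chr1", 100, 200, "+", "pred1"), exon("chr1", 300, 400, "+", "pred1"),
    exon("chr1", 1000, 1500, "-", "pred2"),
    exon("chr1", 5000, 5200, "+", "pred3"),
])
assembled = "\n".join([
    exon("chr1", 90, 200, "+", "asm1"), exon("chr1", 300, 420, "+", "asm1"),
    exon("chr1", 1200, 1600, "-", "asm2"),
])
labels = classify_predictions(predicted, assembled)
print(labels)
```

Note that the sketch requires matching strands, so transcripts assembled from unstranded data (strand `.`) would need extra handling.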

Regarding "functional validation": are you referring to gene function? Validated functional annotation would require functional assays such as knock-down, knock-out, localization, overexpression, or binding assays. Or do you want to infer function from gene expression patterns?

ADD REPLY • link 6.2 years ago by Michael 55k

I think Augustus can take transcript data as input. I would go with Michael's approach (genome-guided transcript assembly), also run de novo transcript assemblies with different assemblers, and then maybe have a look at Mikado.

Once you have a set of transcripts supported by your RNA-seq data, you can go down the traditional bioinformatics route for functional annotation (although, as Michael says, this is really functional prediction). That is, you predict ORFs (e.g. with TransDecoder) and can then use InterProScan for the functional annotation. Additionally (if they are not already included in InterProScan), you could run hmmscan and blastp on the translated ORFs, blastn on the untranslated sequences to find ncRNAs, etc.
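To make the ORF prediction step concrete, here is a toy Python sketch that finds the longest complete ORF (ATG to stop) across all six frames of a transcript. It is not how TransDecoder works internally: TransDecoder also scores coding potential and reports partial ORFs, which this ignores. The function names and the demo sequence are made up, and the standard genetic code is assumed; some yeasts (the CTG clade) translate CUG as serine, so the table would need adjusting there.

```python
# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def reverse_complement(seq):
    """Reverse-complement an upper-case DNA sequence (other symbols are kept as-is)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(nt):
    """Translate a nucleotide string codon by codon; unknown codons become 'X'."""
    return "".join(CODON_TABLE.get(nt[i:i + 3], "X") for i in range(0, len(nt) - 2, 3))


def longest_orf(seq, min_aa=1):
    """Return (strand, start, end, protein) of the longest ATG..stop ORF, or None.

    Coordinates are 0-based, half-open, and relative to the scanned strand;
    the end includes the stop codon.
    """
    best = None
    seq = seq.upper()
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first in-frame start codon opens the ORF
                elif start is not None and CODON_TABLE.get(codon) == "*":
                    protein = translate(s[start:i])
                    if len(protein) >= min_aa and (best is None or len(protein) > len(best[3])):
                        best = (strand, start, i + 3, protein)
                    start = None  # look for the next ORF in this frame
    return best


# Made-up demo transcript containing a single short ORF.
demo = "CCATGGCTAAATTTGGGTAACC"
print(longest_orf(demo))
```

The protein sequences from a step like this are what you would then feed to InterProScan, hmmscan or blastp.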

ADD REPLY • link 6.2 years ago by cschu181 ★ 2.8k

score 0 · Answer 1 · 2021-05-27

Hello,

We have developed a gene annotator called FINDER, which annotates eukaryotic genomes using short-read RNA-Seq data and protein sequences. It is completely automated and requires no manual intervention. FINDER also runs BRAKER and incorporates its predicted genes into the final gene set. You can find the details in the FINDER paper, and the software is available on GitHub.

Thank you.