6.2 years ago
KG
Hi,
We have generated a new version of the genome assembly for a yeast species. We have also sequenced the transcriptome. Now we would like to annotate the genome. My questions are:
- How can we use the transcriptome data for annotation?
- Can you recommend a genome annotation pipeline that uses transcriptome data for functional validation of the annotated features?
Thank you for your time and help.
Are you looking for a validation of predicted transcripts? You could try a genome-guided transcript assembly, then map the transcripts back to the genome and compare the predicted to the assembled transcripts.
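As a rough illustration of that comparison step, here is a minimal Python sketch that scores how much of a predicted transcript is covered by RNA-seq-assembled transcripts. It assumes you have already parsed both sets into lists of half-open `(start, end)` intervals on the same chromosome and strand; the function names `merge_intervals` and `covered_fraction` are made up for this example and do not come from any existing tool.

```python
def merge_intervals(intervals):
    """Merge overlapping or touching (start, end) intervals (half-open)."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Extend the previous interval instead of starting a new one.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged


def covered_fraction(predicted_exons, assembled_exons):
    """Fraction of predicted exon bases overlapped by assembled exons.

    Assumes the exons of the predicted transcript do not overlap each other
    and that all intervals are on the same chromosome and strand.
    """
    support = merge_intervals(assembled_exons)
    total = sum(end - start for start, end in predicted_exons)
    covered = sum(
        max(0, min(end, s_end) - max(start, s_start))
        for start, end in predicted_exons
        for s_start, s_end in support
    )
    return covered / total if total else 0.0


# Hypothetical coordinates, for illustration only.
predicted = [(100, 200), (300, 400)]
assembled = [(150, 250), (240, 350)]
print(covered_fraction(predicted, assembled))
```

A predicted gene with a fraction near 1 is well supported by the RNA-seq assembly, while one near 0 has no expression evidence in your samples (which does not necessarily mean the prediction is wrong). Coverage is used here rather than intron matching because many yeast genes are expected to be single-exon.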
With respect to "functional validation", are you referring to gene function? Validated functional annotation would require functional assays such as knock-down, knock-out, localization, overexpression, binding assays, etc. Or do you want to infer function from gene expression patterns?
I think Augustus can take transcript data as input. I would go with Michael's approach (genome-guided transcript assembly) as well as de novo transcript assemblies with different assemblers, and then maybe have a look at Mikado.
Once you have a set of transcripts supported by your RNA-seq data, you can follow the traditional bioinformatics route for functional annotation (although, as Michael says, this is really functional prediction). That is, you would predict ORFs (e.g. with TransDecoder) and then use InterProScan for the functional annotation. In addition (if these are not already included in InterProScan), you could run hmmscan and blastp on the translated ORFs, run blastn on the untranslated (nucleotide) sequences to find ncRNAs, etc.
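To make the ORF prediction step concrete, here is a toy Python sketch that finds the longest ATG-to-stop ORF in a transcript on either strand and translates it with the standard genetic code. This is not how TransDecoder works internally (it also scores coding potential, among other things); the function `longest_orf` and its `min_codons` parameter are invented for this illustration. Note also that some yeast species use an alternative nuclear genetic code, so the standard table is an assumption you should check for your species.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(seq):
    """Reverse-complement an upper-case DNA string (other letters kept as-is)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def longest_orf(seq, min_codons=3):
    """Return (strand, start, protein) for the longest complete ORF, or None.

    An ORF here starts at ATG and ends at the first in-frame stop codon.
    `start` is 0-based on the scanned strand; `protein` excludes the stop.
    """
    seq = seq.upper()
    best = None
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start, protein = None, []
            for i in range(frame, len(s) - 2, 3):
                aa = CODON_TABLE.get(s[i:i + 3], "X")  # X for ambiguous codons
                if start is None:
                    if aa == "M":
                        start, protein = i, ["M"]
                elif aa == "*":
                    if len(protein) >= min_codons and (
                        best is None or len(protein) > len(best[2])
                    ):
                        best = (strand, start, "".join(protein))
                    start = None
                else:
                    protein.append(aa)
    return best


# Made-up sequence, for illustration only.
print(longest_orf("CCATGGCTAAATGACC"))
```

The translated protein strings are what you would then pass on to InterProScan, hmmscan, or blastp. For real data, use a dedicated ORF predictor rather than this sketch.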