13 months ago
dec986
I am analyzing a fungal species that has no proteome available, only a genome (https://www.ncbi.nlm.nih.gov/datasets/genome/GCA_001567575.1/).
I would like to convert this genome into an estimated proteome. I have searched on Google, but nothing useful shows up.
How can I make a proteome from this genome?
You will need to use one of the eukaryotic gene identification programs (GeneMark, GENSCAN, Exonerate etc.) to identify the genes and then translate the gene models into proteins. Depending on the quality of the genome (it looks like there are 950+ contigs in the assembly you linked above), the quality of the predictions is going to vary. Since you don't seem to have independent EST/RNA-seq data, there is no good way to validate those predictions.
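To make the "translate the gene models into proteins" step concrete, here is a minimal sketch in plain Python. It assumes the gene predictor has already given you exon (CDS) coordinates in GFF style (1-based, inclusive) plus a strand, and that the organism uses the standard nuclear genetic code (NCBI translation table 1), which is worth checking for your species. The function names (`spliced_cds`, `translate`) and the toy contig are made up for illustration; in practice most predictors ship their own tools for this.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons in TCAG order.
# Assumption: the species uses this table; some fungi do not.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def spliced_cds(contig, exons, strand):
    """Join exon segments (1-based inclusive coordinates) into one CDS.

    For minus-strand genes the joined sequence is reverse-complemented.
    """
    cds = "".join(contig[start - 1:end] for start, end in sorted(exons))
    return reverse_complement(cds) if strand == "-" else cds


def translate(cds):
    """Translate a CDS codon by codon, stopping at the first stop codon.

    Codons containing ambiguous bases (e.g. N) become 'X'.
    """
    cds = cds.upper()
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE.get(cds[i:i + 3], "X")
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# Hypothetical toy contig: two exons separated by a short intron.
contig = "CC" + "ATGGCT" + "GTAAGTAG" + "TTTTAA" + "GG"
exons = [(3, 8), (17, 22)]
protein = translate(spliced_cds(contig, exons, "+"))
```

Running this over every predicted gene and writing the results in FASTA format gives the kind of predicted protein set you are after. The result is only as good as the gene models going in.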
What do you mean by an estimated proteome? You could use any of the ab initio annotation pipelines, but this can be a fairly involved process. There is no shortcut to annotating an entire proteome that I know of.
By an estimated proteome I mean something like https://ftp.ncbi.nlm.nih.gov/genomes/refseq/fungi/Verticillium_dahliae/latest_assembly_versions/GCF_000150675.1_ASM15067v2/GCF_000150675.1_ASM15067v2_protein.faa.gz, except predicted, since no such proteome has been measured for my species.
You could use an ab initio approach such as Augustus (https://bioinf.uni-greifswald.de/augustus/). I haven't tried it myself, so I don't know how well it works. Good luck!
Unfortunately, the species that I need isn't among the species available in Augustus :(