How to tell if a proteome is estimated or measured?
4 months ago
dec986 ▴ 380

I am looking through data from many genomes and proteomes, for example https://www.ncbi.nlm.nih.gov/datasets/genome/GCF_902167145.1/

But I'm wondering, how can I tell whether or not this proteome is predicted or measured? I don't see anything in that link.

I know from the post "Proteomes: measured or predicted from genomes/transcriptomes?" that most proteomes are estimated and few are measured, but which one am I seeing with Zea mays?

Normally, I can see >NP_043107.1 hypothetical protein ZemaCp106 (chloroplast) [Zea mays]

which is probably a prediction from the genome, but many other genes are like

>NP_001104839.1 squalene synthase 1 [Zea mays]

Perhaps the latter is measured, and the former is predicted from the genome?

Is there any indicator on NCBI that would tell me either way?

If I look at the annotation report: https://www.ncbi.nlm.nih.gov/refseq/annotation_euk/Zea_mays/103/#ProteinCoverageStats I see that there are 34,337 protein-coding genes, but are these directly measured or not?

genome proteome • 335 views
4 months ago
dthorbur ★ 2.6k

You can see where the annotations come from in the annotation report. In this scenario it appears they are provided by NCBI.

The RefSeq genome records for Zea mays were annotated by the NCBI Eukaryotic Genome Annotation Pipeline

The report also lists the sources of the transcripts and proteins that were aligned to generate the annotations. In this case, the RNA-seq data came from samples in multiple different BioProjects. I haven't checked the original submission, but if the authors provided any transcript or protein data, these would also have been used.

To my knowledge there isn't going to be a breakdown of which RNA/protein sample supports each annotation, so I don't think you'll be able to find which annotations come from data the authors generated directly from the reference sample. In cases where the genome was not annotated by NCBI, you'll have to go through the associated publication to see the methodology.

That said, since so many BioProjects contributed to the annotation of this genome, I would consider all of these annotations as predictions, even if the original protein was isolated from a different Zea mays strain.
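One indicator you can check quickly is the RefSeq accession prefix in the FASTA headers. As far as I know, XP_ records are model proteins predicted by the annotation pipeline, while NP_ records are tied to an NM_ or NC_ record (for nuclear genes these are usually the curated "known" RefSeqs with transcript support). Neither prefix means the protein was detected directly, for example by mass spectrometry. Below is a minimal sketch that tallies prefixes from FASTA header lines. The function name `tally_prefixes`, the placeholder sequences, and the XP_ accession in the example are made up for illustration, and the prefix descriptions are my own paraphrase of the RefSeq documentation.

```python
import re
from collections import Counter

# Paraphrased meanings of RefSeq protein accession prefixes (an assumption
# based on the RefSeq documentation, not taken from the annotation report).
# None of these categories means "detected at the protein level".
PREFIX_MEANING = {
    "NP": "associated with an NM_ or NC_ record (often a curated 'known' RefSeq)",
    "XP": "model RefSeq predicted by the genome annotation pipeline",
    "YP": "annotated on a genomic molecule without a separate transcript record",
    "WP": "non-redundant protein shared across strains/species (mostly prokaryotes)",
}

# Matches headers such as ">NP_043107.1 hypothetical protein ..."
ACCESSION_RE = re.compile(r"^>([A-Z]{2})_\d+\.\d+")


def tally_prefixes(fasta_lines):
    """Count two-letter RefSeq prefixes over the header lines of a FASTA file."""
    counts = Counter()
    for line in fasta_lines:
        if not line.startswith(">"):
            continue  # skip sequence lines
        match = ACCESSION_RE.match(line)
        counts[match.group(1) if match else "other"] += 1
    return counts


# The two NP_ headers come from the question; the XP_ accession and the
# "MSEQ" sequence lines are hypothetical placeholders.
example = [
    ">NP_043107.1 hypothetical protein ZemaCp106 (chloroplast) [Zea mays]",
    "MSEQ",
    ">NP_001104839.1 squalene synthase 1 [Zea mays]",
    "MSEQ",
    ">XP_000000001.1 made-up example record [Zea mays]",
    "MSEQ",
]

for prefix, n in tally_prefixes(example).items():
    print(prefix, n, PREFIX_MEANING.get(prefix, "unrecognised prefix"))
```

To run it on a downloaded protein file, pass an open file handle instead of the list, e.g. `tally_prefixes(open("protein.faa"))`, where `protein.faa` is whatever name your download has.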
