Understanding Blast2GO annotation results
3.7 years ago
Kash ▴ 110

Hi,

I am new to functional annotation and am annotating a set of genomic sequences. I have the following questions about how to interpret the Blast2GO combined GO graphs. Any help is appreciated.

  1. Are only the sequences with annotated GO terms used to generate the combined GO graphs?

  2. Are only the mapped GO terms for blasted sequences annotated?

  3. Do sequences that have InterProScan GO terms but no BLAST-mapped GO terms still get annotated?

  4. Some annotated sequences have GO terms from all three categories (MF, BP, CC). Does this mean these sequences appear in all three types of combined GO graphs (MF, BP, CC)?

  5. Is it biologically possible for a single sequence to have annotations in all three categories (MF, BP, CC), and sometimes more than one MF, BP, or CC term?

Thank you in advance.

combinedGOgraph annotation blast2go • 1.2k views
3.7 years ago

To clarify something: sequences are not annotated, gene products are. Those gene products also have sequences associated with them. This is an important distinction, especially when using BLAST to annotate.

A gene or gene product designated by a label, say TP53, may have zero or more annotations in each of the GO domains: Cellular Component, Biological Process, and Molecular Function. The most annotated gene product at this time has over one thousand Gene Ontology terms associated with it. Here are the top three genes (by annotation count) for the human genome:

1097    HTT
 951    TP53
 818    EGFR
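Counts like these can be derived from a GO annotation file. The following is a minimal sketch, assuming input lines in the tab-separated GAF 2.x layout (gene symbol in column 3, GO ID in column 5, aspect `P`/`F`/`C` in column 9). The gene names, GO IDs, and the `make_row` helper are made-up placeholders, and counting unique terms per gene is an assumption; the numbers above may have been computed differently.

```python
from collections import Counter


def count_annotations(gaf_lines):
    """Return {gene_symbol: Counter({aspect: number_of_unique_GO_terms})}."""
    seen = set()      # (symbol, go_id) pairs already counted
    counts = {}
    for line in gaf_lines:
        # Skip blank lines and GAF header/comment lines, which start with "!".
        if not line.strip() or line.startswith("!"):
            continue
        cols = line.rstrip("\n").split("\t")
        symbol, go_id, aspect = cols[2], cols[4], cols[8]
        if (symbol, go_id) in seen:
            continue  # same term attached twice (e.g. different evidence)
        seen.add((symbol, go_id))
        counts.setdefault(symbol, Counter())[aspect] += 1
    return counts


def make_row(symbol, go_id, aspect):
    """Hypothetical helper: build a toy GAF-like row with placeholder columns."""
    cols = ["DB", "ID", symbol, "", go_id, "REF:0", "IEA", "", aspect]
    return "\t".join(cols)


# Toy data only; GENE_A / GENE_B and the GO IDs are placeholders.
toy = [
    "!gaf-version: 2.2",
    make_row("GENE_A", "GO:0000001", "P"),
    make_row("GENE_A", "GO:0000002", "F"),
    make_row("GENE_A", "GO:0000003", "C"),
    make_row("GENE_A", "GO:0000001", "P"),  # duplicate term, counted once
    make_row("GENE_B", "GO:0000002", "F"),
]

counts = count_annotations(toy)
# Print genes from most to least annotated, with the per-aspect breakdown.
for symbol, by_aspect in sorted(counts.items(), key=lambda kv: -sum(kv[1].values())):
    print(sum(by_aspect.values()), symbol, dict(by_aspect))
```

In this toy input, GENE_A carries one term in each of the three aspects, while GENE_B has a single Molecular Function term and nothing else.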

The least annotated gene products (and there are thousands of them) have zero GO annotations, so the discrepancy between genes is very large.

In theory, every gene and gene product should have at least one entry in each of CC, MF, and BP. Realistically, every gene has multiple functions, so it is more accurate to say that every gene should have dozens if not hundreds of entries in each category.

When you run Blast2GO, your sequences are matched (usually partially) to one or more sequences that may have gene products annotated in GO. The hits that partially match and happen to have annotations in the database are processed into a report, which is a good starting point.
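To make the idea of transferring annotations from hits concrete, here is a naive sketch. It is not Blast2GO's actual annotation rule, which is more involved; the function name `transfer_go_terms`, the `max_evalue` cutoff, and all sequence names and GO IDs are hypothetical placeholders.

```python
from collections import defaultdict


def transfer_go_terms(hits, subject_go, max_evalue=1e-5):
    """Naively copy GO terms from annotated BLAST hits onto query sequences.

    hits:       {query_id: [(subject_id, evalue), ...]}
    subject_go: {subject_id: {(go_id, aspect), ...}}
    Returns     {query_id: {aspect: set_of_go_ids}}
    """
    annotations = {}
    for query, query_hits in hits.items():
        by_aspect = defaultdict(set)
        for subject, evalue in query_hits:
            if evalue > max_evalue:
                continue  # hit too weak to trust (assumed cutoff)
            # A hit with no GO entry contributes nothing.
            for go_id, aspect in subject_go.get(subject, ()):
                by_aspect[aspect].add(go_id)
        annotations[query] = dict(by_aspect)
    return annotations


# Toy inputs; every identifier here is a placeholder.
hits = {
    "seq1": [("subjA", 1e-50), ("subjB", 1e-3)],  # subjB fails the cutoff
    "seq2": [("subjC", 1e-20)],                   # good hit, but unannotated
    "seq3": [],                                   # no BLAST hit at all
}
subject_go = {
    "subjA": {("GO:0000001", "P"), ("GO:0000002", "F"), ("GO:0000003", "C")},
    "subjB": {("GO:0000004", "P")},
}

result = transfer_go_terms(hits, subject_go)
for query in sorted(result):
    print(query, {aspect: sorted(terms) for aspect, terms in result[query].items()})
```

In this sketch `seq1` ends up with terms in all three aspects (P, F, C), because its best hit is annotated in all three. `seq2` has a strong hit but gets no terms, since that hit has no GO annotation, and `seq3` gets nothing because it has no hits.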
