Many redundant lines in a genes2GO mapping file: is this normal, and will it influence my topGO enrichment?
5.9 years ago
david.roux ▴ 40

Hello,

I am working on an RNA-seq dataset and need to run a topGO enrichment analysis (in R). I have downloaded the up-to-date "genes2GO" mapping file from a database. However, when I explored the mapping file, it looked odd: it contains many exactly duplicated lines.

  1. Is this file ok?
  2. Will it influence the subsequent topGO enrichment of my data?
  3. Does the duplication mean something?

Thanks

Example:

Transcript ID   GO terms
XXXXX42200.1    GO0004252|GO:0006508
XXXXX42200.1    GO0004252|GO:0006508
XXXXX42200.1    GO0004252|GO:0006508
XXXXX42200.1    GO0004252|GO:0006508
XXXXX42200.1    GO0004252|GO:0006508
XXXXX02300.3    GO0005515|GO:0007165
XXXXX02300.3    GO0005515|GO:0007165
XXXXX02300.3    GO0005515|GO:0007165
XXXXX02300.3    GO0005515|GO:0007165
XXXXX02300.3    GO0043531
XXXXX02300.3    GO0005515|GO:0007165
XXXXX05700.2    GO0009058|GO:0016746
XXXXX05700.2    GO0003824|GO:0008152
XXXXX05700.2    GO0003824|GO:0008152
XXXXX05700.2    GO0009058|GO:0016746
XXXXX02300.1    GO0043531
XXXXX02300.1    GO0005515|GO:0007165
XXXXX02300.1    GO0005515|GO:0007165
XXXXX02300.1    GO0005515|GO:0007165
XXXXX02300.1    GO0005515|GO:0007165
XXXXX02300.1    GO0005515|GO:0007165
XXXXX13500.1    GO0003723|GO:0016787|GO:0030145
XXXXX13500.1    GO0016787
XXXXX13500.1    GO0016787
XXXXX13500.1    GO0016787
XXXXX13500.1    GO0016787
XXXXX13500.1    GO0016787
XXXXX13500.2    GO0016787
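
One way to check whether the duplication carries any information is to collapse the file to a single line per transcript holding the unique set of its GO terms. The following is a minimal Python sketch, not a tested pipeline step. The function names `collapse_mappings` and `format_for_topgo` are made up for illustration. It assumes that `GO0004252` and `GO:0004252` denote the same term, and it writes each transcript ID, a tab, then comma-separated GO IDs, which is the layout topGO's `readMappings` reads with its default separators.

```python
import re

# Matches a GO identifier with or without the colon (assumption: both
# spellings in the file refer to the same term).
GO_ID = re.compile(r"GO:?(\d{7})")


def collapse_mappings(lines):
    """Merge repeated transcript lines into one unique GO-term list each.

    Returns a dict mapping transcript ID -> list of GO IDs, keeping the
    order in which transcripts and terms are first seen.
    """
    merged = {}
    for line in lines:
        parts = line.strip().split(None, 1)
        if len(parts) != 2:
            continue  # blank or malformed line
        transcript, term_field = parts
        terms = ["GO:" + digits for digits in GO_ID.findall(term_field)]
        if not terms:
            continue  # header line or a line without any GO ID
        seen = merged.setdefault(transcript, [])
        for term in terms:
            if term not in seen:
                seen.append(term)
    return merged


def format_for_topgo(merged):
    """Render one 'ID<TAB>GO:1, GO:2' line per transcript."""
    return [tid + "\t" + ", ".join(terms) for tid, terms in merged.items()]


if __name__ == "__main__":
    sample = [
        "Transcript ID   GO terms",
        "XXXXX13500.1    GO0003723|GO:0016787|GO:0030145",
        "XXXXX13500.1    GO0016787",
        "XXXXX13500.1    GO0016787",
        "XXXXX13500.2    GO0016787",
    ]
    for out_line in format_for_topgo(collapse_mappings(sample)):
        print(out_line)
```

If the collapsed file has the same transcript-to-term pairs as the original, the repeated lines added no annotation of their own. The output lines can be written to a file and loaded in R with `readMappings`.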
RNA-Seq topGO • 829 views