5.9 years ago
david.roux
Hello,
I am working on an RNA-seq dataset and need to run a topGO enrichment analysis (in R). I downloaded the up-to-date "genes2Go" mapping file from a database. However, when I explored the mapping file, it looked odd: it contains many exactly duplicated lines.
- Is this file ok?
- Will it influence the subsequent topGO enrichment of my data?
- Does the duplication mean something?
Thanks
Example:
Transcript ID GO terms
XXXXX42200.1 GO0004252|GO:0006508
XXXXX42200.1 GO0004252|GO:0006508
XXXXX42200.1 GO0004252|GO:0006508
XXXXX42200.1 GO0004252|GO:0006508
XXXXX42200.1 GO0004252|GO:0006508
XXXXX02300.3 GO0005515|GO:0007165
XXXXX02300.3 GO0005515|GO:0007165
XXXXX02300.3 GO0005515|GO:0007165
XXXXX02300.3 GO0005515|GO:0007165
XXXXX02300.3 GO0043531
XXXXX02300.3 GO0005515|GO:0007165
XXXXX05700.2 GO0009058|GO:0016746
XXXXX05700.2 GO0003824|GO:0008152
XXXXX05700.2 GO0003824|GO:0008152
XXXXX05700.2 GO0009058|GO:0016746
XXXXX02300.1 GO0043531
XXXXX02300.1 GO0005515|GO:0007165
XXXXX02300.1 GO0005515|GO:0007165
XXXXX02300.1 GO0005515|GO:0007165
XXXXX02300.1 GO0005515|GO:0007165
XXXXX02300.1 GO0005515|GO:0007165
XXXXX13500.1 GO0003723|GO:0016787|GO:0030145
XXXXX13500.1 GO0016787
XXXXX13500.1 GO0016787
XXXXX13500.1 GO0016787
XXXXX13500.1 GO0016787
XXXXX13500.1 GO0016787
XXXXX13500.2 GO0016787
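
For anyone who wants to check how much of such a file is duplicated, here is a minimal sketch in Python (run before loading anything into R). It counts exact duplicate lines and collapses the file to one line per transcript with the union of its GO terms. The function names are hypothetical, the header line is assumed to have been skipped already, and the colon normalization assumes that "GO0004252" in the example is meant to be "GO:0004252"; none of this comes from the database's documentation.

```python
from collections import Counter


def normalize_go_id(term):
    """Insert the colon after 'GO' if it is missing.

    ASSUMPTION: 'GO0004252' in the mapping file stands for 'GO:0004252'.
    """
    term = term.strip()
    if term.startswith("GO") and not term.startswith("GO:"):
        return "GO:" + term[2:]
    return term


def summarize_and_collapse(lines):
    """Return (duplicated_lines, merged_terms) for whitespace-separated mapping lines.

    duplicated_lines: exact line text -> number of occurrences (only if > 1)
    merged_terms: transcript ID -> unique GO IDs in first-seen order
    """
    line_counts = Counter()
    merged = {}
    for raw in lines:
        line = raw.strip()
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue  # skip blank lines and lines without a GO field
        line_counts[line] += 1
        transcript, go_field = parts
        terms = merged.setdefault(transcript, [])
        for term in go_field.split("|"):
            go_id = normalize_go_id(term)
            if go_id and go_id not in terms:
                terms.append(go_id)
    duplicated = {line: n for line, n in line_counts.items() if n > 1}
    return duplicated, merged


def to_topgo_lines(merged):
    """Format as 'ID<TAB>GO:..., GO:...', one line per transcript."""
    return [f"{tid}\t{', '.join(terms)}" for tid, terms in merged.items()]


# A few lines taken from the example above (header line left out).
sample = [
    "XXXXX02300.3 GO0005515|GO:0007165",
    "XXXXX02300.3 GO0005515|GO:0007165",
    "XXXXX02300.3 GO0043531",
    "XXXXX13500.1 GO0003723|GO:0016787|GO:0030145",
    "XXXXX13500.1 GO0016787",
    "XXXXX13500.1 GO0016787",
    "XXXXX13500.2 GO0016787",
]

duplicated, merged = summarize_and_collapse(sample)
for line in to_topgo_lines(merged):
    print(line)
```

The collapsed lines use the tab-separated "ID, then comma-separated GO IDs" layout that topGO's `readMappings()` reads with its default separators. Whether the repeated lines carry any meaning in the source database is a separate question that this sketch does not answer.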