Duplicated gene symbols in a table from a paper's supplementary information
5.2 years ago
delacroixed ▴ 10

Hello everyone,

Recently, I downloaded this table:

https://journals.plos.org/plosgenetics/article?id=10.1371/journal.pgen.1001154#s5

However, I noticed that 115 of the 18,931 gene symbols are duplicated (some of them appear more than twice).

What would be the best way to handle these duplicates?
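For reference, a minimal sketch of how repeated symbols can be listed, assuming the symbols have already been read into a Python list. The function name `duplicated_symbols` and the toy input are illustrative only and are not taken from the paper's table.

```python
from collections import Counter


def duplicated_symbols(symbols):
    """Return {symbol: count} for every symbol that occurs more than once."""
    counts = Counter(symbols)
    return {symbol: n for symbol, n in counts.items() if n > 1}


# Hypothetical toy input; replace with the symbol column of the real table.
example = ["BRCA1", "KRTAP13", "TP53", "KRTAP13", "KRTAP13"]
print(duplicated_symbols(example))  # {'KRTAP13': 3}
```

Running the same check on the raw file, before any processing step, shows whether the repeats are in the data or are introduced later.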

Thank you in advance.

Francisco Requena

haploinsufficiency • 1.3k views

What are the chromosomal locations of those repeated gene symbols? Do the duplicates share the same location or map to different ones?

As for "I was wondering what is the best way to proceed", that is impossible to answer unless you explain what you would like to do with those genes.


Hello! Thank you for your fast reply. I have checked their locations, and they are distributed across the genome. This score (along with others) will be displayed in a software tool for clinicians. Because some genes are duplicated, a user who searches for one of them would see two rows with the same information but different HI scores.


I think EagleEye was asking whether, for any given pair of duplicated gene symbols, the two entries have the same genomic coordinates. Also, can you provide an example of such a gene symbol pair?
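To make the question concrete, here is a rough sketch of that check. It assumes each row is a `(symbol, chrom, start, end)` tuple, which is a hypothetical layout and not necessarily that of the table in question. The gene names and coordinates are placeholders.

```python
from collections import defaultdict


def classify_duplicates(rows):
    """For each symbol seen more than once, report whether all of its
    rows share one location ('same') or not ('different')."""
    locations = defaultdict(set)
    counts = defaultdict(int)
    for symbol, chrom, start, end in rows:
        locations[symbol].add((chrom, start, end))
        counts[symbol] += 1
    return {
        symbol: "same" if len(locations[symbol]) == 1 else "different"
        for symbol, n in counts.items()
        if n > 1
    }


# Placeholder rows (made-up names and coordinates, for illustration only).
rows = [
    ("GENE_A", "chr1", 100, 200),
    ("GENE_A", "chr1", 100, 200),
    ("GENE_B", "chr2", 500, 900),
    ("GENE_B", "chr7", 50, 80),
    ("GENE_C", "chr3", 10, 20),
]
print(classify_duplicates(rows))  # {'GENE_A': 'same', 'GENE_B': 'different'}
```

Duplicates at the same location are more likely to be repeated rows, while duplicates at different locations suggest that distinct genes ended up with the same label.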


First, you didn't link to a table but to the list of supplementary material of the paper. Second, there are two tables there and both have fewer than 18000 lines (so presumably fewer gene symbols) and don't appear to have duplicated gene symbols. Could it be that you're talking about another data set or paper?

5.2 years ago
delacroixed ▴ 10

I checked the raw data again and found the cause of the duplicated symbols. My script was trimming the end of any symbol that has a dash (-) followed by a number (e.g. KRTAP13-1, KRTAP13-2, KRTAP13-3...), so these symbols all collapsed to the same prefix (e.g. KRTAP13). I have fixed the script. There is no problem with the dataset. Thank you!
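A small sketch of how this kind of trimming creates artificial duplicates. The original script is not shown in this thread, so `trim_suffix` below is only a hypothetical stand-in for the faulty step, and the symbol list is a toy example built from the names mentioned above.

```python
import re
from collections import Counter

symbols = ["KRTAP13-1", "KRTAP13-2", "KRTAP13-3", "TP53"]


def trim_suffix(symbol):
    # Hypothetical stand-in for the faulty step: drops a trailing "-<number>".
    return re.sub(r"-\d+$", "", symbol)


trimmed = [trim_suffix(s) for s in symbols]

# Symbols that occur more than once before and after trimming.
raw_dups = [s for s, n in Counter(symbols).items() if n > 1]
trimmed_dups = [s for s, n in Counter(trimmed).items() if n > 1]

print(raw_dups)      # []
print(trimmed_dups)  # ['KRTAP13']
```

Comparing duplicate counts before and after each processing step is a quick way to tell whether repeats come from the dataset or from the script.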
