Dear friends,
I am a rookie in RNA-Seq analysis, so I am really confused about Ensembl IDs and gene symbols. I have checked some related posts on Biostars, but I did not find what I was looking for. The problem is that I noticed one Ensembl ID matches multiple gene symbols, and I do not know how to deal with this when I want to do an analysis based on gene symbols. Should I add the counts of the same symbol together? Thanks in advance for the reply!
Hello dz2353,
could you please post an example?
fin swimmer
Are you sure it's "multiple Ensembl IDs for a gene symbol" or "multiple gene symbols for an Ensembl ID"?
Sorry, I made a mistake. Thanks to @arup for the comment. I have corrected my question: an Ensembl ID matches multiple gene symbols. I have posted a picture of my data, and you can see that there are many RF00019 entries. Should I add all of them together? https://ibb.co/9T1xkBD
The issue is the decimal point and the digits after it at the end of your ENSG IDs. Remove them.
I know that the part after the decimal point represents the version. So what I need to do is delete the decimal point and everything after it, and then convert the Ensembl IDs to gene symbols? Thanks in advance for the reply!
Yes, that's what you have to do.
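In case it helps, here is a minimal sketch of the stripping step in plain Python. The function name `strip_ensembl_version` and the example IDs (including their version numbers) are only illustrative assumptions, not taken from your data, so adapt them to however your count table is loaded.

```python
def strip_ensembl_version(ensembl_id: str) -> str:
    """Return the Ensembl ID without its version suffix.

    Everything from the first "." onwards is dropped, so
    "ENSG00000141510.17" becomes "ENSG00000141510". IDs that
    carry no version are returned unchanged.
    """
    return ensembl_id.strip().split(".", 1)[0]


# Illustrative IDs only; the version numbers here are made up.
versioned_ids = ["ENSG00000141510.17", "ENSG00000012048.23", "ENSG00000139618"]

# Apply the helper to every ID before looking up gene symbols.
stripped_ids = [strip_ensembl_version(i) for i in versioned_ids]
print(stripped_ids)
```

After this step the unversioned IDs can be passed to whichever ID-to-symbol mapping tool you prefer.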
Thank you! I will try it.