LibreOffice in Ubuntu is changing my gene names to dates
5.4 years ago
cs18s003 ▴ 20

I have been working with a gene expression dataset and have noticed that some genes in my gene column, such as MAR and SEPT, are getting converted into March and September. Is there a way of getting this right? Please help.

gene libreoffice • 1.2k views

(untested) Why don't you use something like awk to simply quote the gene names, like "SEPT", in the original file to save them from being interpreted as dates?


But I want the entire name without losing the number, because SEPT1 and SEPT2 are two different genes.


Quote the entire gene name (symbol and number together, e.g. "SEPT1") prior to reading into Excel/LibreOffice to protect it from manipulation.
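A minimal sketch of that quoting step in Python, assuming a tab-separated file with the gene names in the first column. The function name `quote_gene_names` and the column layout are hypothetical, and whether the spreadsheet then treats quoted fields as text depends on its import settings, so check the result on a small file first.

```python
def quote_gene_names(lines, gene_col=0, sep="\t"):
    """Wrap the whole gene name (e.g. SEPT1, not just SEPT) in double quotes.

    Assumes `lines` are rows of a delimited text table and that the gene
    name sits in column `gene_col`; both are assumptions for illustration.
    """
    quoted = []
    for line in lines:
        fields = line.rstrip("\n").split(sep)
        if len(fields) > gene_col:
            # Double any embedded quotes so the field stays well-formed.
            name = fields[gene_col].replace('"', '""')
            fields[gene_col] = f'"{name}"'
        quoted.append(sep.join(fields))
    return quoted


rows = ["SEPT1\t5.2", "SEPT2\t4.8", "MARCH1\t3.1"]
for row in quote_gene_names(rows):
    print(row)
```

The number stays inside the quotes, so SEPT1 and SEPT2 remain distinct.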


I went to a seminar on time series analysis, and one of the first points made was to never open data in software like Excel/LibreOffice. Do you know R or Python? You could start looking into how to work with tabular data in those languages if this is something you need to do a lot.
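As a rough illustration of the Python route, here is a sketch that reads a tab-separated expression table with the standard-library `csv` module, which keeps every field as a plain string and does no date guessing. The column name `gene`, the sample data, and the helper `read_expression_table` are made up for the example.

```python
import csv
import io


def read_expression_table(handle, delimiter="\t"):
    """Read a delimited table into a list of dicts; all values stay strings."""
    return list(csv.DictReader(handle, delimiter=delimiter))


# Hypothetical in-memory file; in practice pass an open file handle instead.
sample = "gene\tsample1\nSEPT1\t5.2\nMARCH1\t3.1\n"
rows = read_expression_table(io.StringIO(sample))
print([row["gene"] for row in rows])  # ['SEPT1', 'MARCH1']
```

Numeric columns then need an explicit `float(...)` where you want numbers, which is the point: nothing is converted unless you ask for it.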

5.4 years ago

Don't use Excel or related software for scientific purposes. This problem is already documented in several papers:
- Mistaken Identifiers: Gene name errors can be introduced inadvertently when using Excel in bioinformatics. BMC Bioinformatics, 2004
- Gene name errors are widespread in the scientific literature. Genome Biology, 2016

Also discussed in a Nature job blog post which references this kludge for those who insist on using Excel:
Escape Excel: A tool for preventing gene symbol and accession conversion errors

and also discussed in this previous Biostars question.

This problem has been well documented for over 15 years now. When will people learn?

As for recovering from this, you could try this web-based tool, but frankly I would never be entirely confident that the correct identifiers have been restored.
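Before attempting any recovery, it may help to check how many identifiers were affected. The sketch below flags values that look like spreadsheet-converted dates such as `1-Mar` or `2-Sep`. The pattern is an assumption: the actual rendering depends on locale and cell format, and full dates or serial numbers would not be caught. It only detects suspicious values; it does not restore the original symbols.

```python
import re

# Assumed English month abbreviations as they commonly appear after conversion.
MONTHS = "Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec"
DATE_LIKE = re.compile(
    rf"^(?:\d{{1,2}}-(?:{MONTHS})|(?:{MONTHS})-\d{{1,2}})$",
    re.IGNORECASE,
)


def find_date_like(names):
    """Return the identifiers that look like day-month or month-day dates."""
    return [name for name in names if DATE_LIKE.match(name.strip())]


suspects = find_date_like(["TP53", "1-Mar", "2-Sep", "SEPT2", "Dec-01"])
print(suspects)  # ['1-Mar', '2-Sep', 'Dec-01']
```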

EDIT: This also applies to LibreOffice, which is free software that tries to clone Excel, good and bad features included.


Thanks for the help. But in my case I think the database itself was saved with the gene names already converted into dates, so I couldn't recover them.


In that case, you may want to see if this tool will help: Truke.


I had the genomic locations of the genes (chromosome number and coordinates), so I used the UCSC Genome Browser to recover the gene names. I then copied the entire gene column into a WordPad file and used Replace All on the affected genes to place a quote before each gene name, e.g. 'MARCH1, 'SEPT1 and so on. The problem is now sorted.
