Hello.
I have successfully completed my DESeq2 analysis and would now like to generate a list of upregulated and downregulated genes. I have the MA plot, which shows me how many genes have been up- and downregulated. But how do I get the gene IDs/names of those? I see the gene ID is written as follows: gene:Solyc00g500001.1 (I am working with tomatoes).
You are talking about this table below?
How do I then get the gene names from these gene:Solyc...? Do you possibly know how I can convert them to the specific gene names?
You should solve one problem at a time. That way you make better progress, and at least you know where you get stuck.
First, you have the problem of selecting the up/downregulated IDs;
then you seem to have the problem of mapping those IDs to more meaningful gene names.
If your table is already in R, then write the few lines of code needed to select the up- and downregulated elements. Any large language model can give you the code for that.
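As a rough sketch of that selection step done outside R: assuming the results were exported with something like `write.csv(as.data.frame(res), "deseq2_results.csv")` (where `res` is whatever you named your DESeq2 results object), a short awk filter can split the IDs into up and down lists. The file names, the toy values, and the cutoffs (padj < 0.05, |log2FoldChange| >= 1) are all assumptions for illustration, and the column positions assume the default results layout (baseMean, log2FoldChange, lfcSE, stat, pvalue, padj), so check your own header first.

```bash
# Toy stand-in for an exported DESeq2 results table. All values are made up.
cat > deseq2_results.csv <<'EOF'
"","baseMean","log2FoldChange","lfcSE","stat","pvalue","padj"
"gene:Solyc00g500001.1",120.5,2.31,0.40,5.78,7.6e-09,1.2e-07
"gene:Solyc00g500002.1",88.1,-1.75,0.35,-5.00,5.7e-07,4.0e-06
"gene:Solyc00g500003.1",15.2,0.20,0.50,0.40,0.69,0.85
"gene:Solyc00g500004.1",3.4,3.10,1.20,2.58,0.0098,NA
EOF

# Cutoffs are assumptions; adjust them to your experiment.
PADJ=0.05
LFC=1

# Column 1 = gene ID, column 3 = log2FoldChange, column 7 = padj.
# Skip the header, drop rows whose padj is NA, strip the quotes from the ID,
# then write each significant ID to the up or down list.
awk -F',' -v padj="$PADJ" -v lfc="$LFC" '
  NR > 1 && $7 != "NA" && $7 + 0 < padj {
    gsub(/"/, "", $1)
    if ($3 + 0 >=  lfc) print $1 > "up_genes.txt"
    if ($3 + 0 <= -lfc) print $1 > "down_genes.txt"
  }
' deseq2_results.csv

wc -l up_genes.txt down_genes.txt
```

Genes with an NA adjusted p-value are left out on purpose, since DESeq2 assigns NA when a gene was filtered out or flagged, and those should not be counted as significant.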
Once you have that, you want to map the gene IDs to names. I don't know much about the tomato genome, but look at BioMart or some other resource to remap the gene IDs.
Here is a nice solution from ChatGPT that seems legit, though I have not verified these instructions:
Mapping Gene IDs (e.g., Solyc00g500001.1) to Gene Names

To map gene IDs like Solyc00g500001.1 (which appears to be from the tomato genome, Solanum lycopersicum) to gene names, you can use a few different approaches depending on the resources available for the organism you're working with. Below are some strategies:

1. BioMart (Ensembl Plants): search with the gene ID (e.g., Solyc00g500001.1).
2. NCBI Gene: search with the gene ID (e.g., Solyc00g500001.1).
3. Phytozome
4. Custom Annotation Files: Tools like grep, awk, or dedicated parsers in Python or R can help you map gene IDs to gene names from these files.

Example:

```bash
# -F matches the ID literally; GFF3 is tab-separated, so split on tabs
# to print the whole attributes column (column 9).
grep -F "Solyc00g500001.1" annotation.gff3 | awk -F'\t' '{print $9}'
```
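To go from a single lookup to a whole list, the same idea can be sketched as a small awk script that reads a file of gene IDs and pulls the matching `Name` and `description` attributes out of the GFF3. This is only a sketch under assumptions: the file names are placeholders, the toy GFF3 lines are invented, and it assumes gene features carry `ID=gene:...` with optional `Name=` and `description=` attributes. Check a few lines of your own annotation file to see which attribute keys it actually uses.

```bash
# Toy GFF3 (tab-separated). Coordinates, names, and descriptions are invented.
printf 'ch00\tdemo\tgene\t100\t900\t.\t+\t.\tID=gene:Solyc00g500001.1;Name=DEMO1;description=hypothetical demo protein\n' > annotation.gff3
printf 'ch00\tdemo\tmRNA\t100\t900\t.\t+\t.\tID=transcript:Solyc00g500001.1.1;Parent=gene:Solyc00g500001.1\n' >> annotation.gff3
printf 'ch00\tdemo\tgene\t1500\t2400\t.\t-\t.\tID=gene:Solyc00g500002.1;description=another demo protein\n' >> annotation.gff3

# Toy list of IDs to annotate (for example, the up- or downregulated genes).
printf 'gene:Solyc00g500001.1\ngene:Solyc00g500002.1\n' > gene_ids.txt

awk -F'\t' '
  # First file: the ID list. Remember every ID we want to annotate.
  NR == FNR { want[$1] = 1; next }
  # Second file: the GFF3. Skip comment lines and non-gene features.
  /^#/ || $3 != "gene" { next }
  {
    id = name = desc = ""
    # Column 9 holds key=value pairs separated by semicolons.
    n = split($9, attr, ";")
    for (i = 1; i <= n; i++) {
      eq  = index(attr[i], "=")
      key = substr(attr[i], 1, eq - 1)
      val = substr(attr[i], eq + 1)
      if (key == "ID") id = val
      else if (key == "Name") name = val
      else if (key == "description") desc = val
    }
    # Emit ID, name, description; use NA where an attribute is missing.
    if (id in want)
      print id "\t" (name != "" ? name : "NA") "\t" (desc != "" ? desc : "NA")
  }
' gene_ids.txt annotation.gff3 > id_to_name.tsv

cat id_to_name.tsv
```

Many plant annotations give only a description and no short gene symbol, which is why the sketch keeps both columns and falls back to NA instead of dropping the gene.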
Thank you very much for your assistance, I really appreciate it. I'm quite new to all this and have only been working on it for a few months, so I'm still learning R and DESeq2.