Mapping Gene IDs (e.g., Solyc00g500001.1) to Gene Names

Question

Getting a list of up- and downregulated genes using DESeq2

3 months ago

duplessisantzel • 0

Hello.

I have successfully completed my DESeq2 analysis and would now like to generate a list of upregulated and downregulated genes. I have the MA plot, which shows me how many genes have been up- and downregulated. But how do I get the gene IDs/names of those? I see the gene ID is written as follows: gene:Solyc00g500001.1 (I am working with tomatoes).

geneID DESeq2 • 533 views

ADD COMMENT • link updated 3 months ago by Istvan Albert 102k • written 3 months ago by duplessisantzel • 0

score 1 · Answer 1 · 2024-09-05

3 months ago

Istvan Albert 102k

Typically the fold change is computed as B/A, that is, the expression level in the condition (B) divided by the expression level in the control (A) (in a two-sample comparison).

So every fold change that is larger than 1 (or positive on the log scale) is upregulated in B relative to A and every fold change less than 1 (or negative on the log scale) is downregulated in B relative to A.

Take your table and split it by the sign of the log fold change (the log2FoldChange column in DESeq2 output).
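As a rough illustration of that split, here is a minimal sketch in bash/awk. It assumes the results were exported from R with `write.csv(as.data.frame(res), "deseq2_results.csv")`, so the gene ID is column 1, log2FoldChange is column 3, and padj is column 7. The file names, the toy rows, and the 0.05 adjusted p-value cutoff are all assumptions for the example, not something taken from the thread.

```bash
# Toy stand-in for an exported DESeq2 table; all values are made up.
cat > deseq2_results.csv <<'EOF'
"","baseMean","log2FoldChange","lfcSE","stat","pvalue","padj"
"gene:Solyc00g500001.1",120.5,2.31,0.40,5.77,7.9e-09,1.2e-06
"gene:Solyc00g500002.1",85.2,-1.75,0.35,-5.00,5.7e-07,4.0e-05
"gene:Solyc00g500003.1",40.1,0.20,0.50,0.40,0.69,0.85
"gene:Solyc00g500004.1",3.2,1.10,0.90,1.22,0.22,NA
EOF

# Split significant genes by the sign of log2FoldChange.
# alpha is an assumed cutoff; adjust it to your own analysis.
awk -F',' -v alpha=0.05 '
    NR == 1 { next }                 # skip the header line
    { gsub(/"/, "") }                # drop the quotes that write.csv adds
    $7 == "NA" { next }              # skip genes without an adjusted p-value
    $7 + 0 < alpha && $3 + 0 > 0 { print $1 > "up.txt" }    # up in B vs A
    $7 + 0 < alpha && $3 + 0 < 0 { print $1 > "down.txt" }  # down in B vs A
' deseq2_results.csv
```

After this runs, `up.txt` and `down.txt` each hold one gene ID per line. If your table has a different column order, change the `$3` and `$7` references accordingly.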

ADD COMMENT • link 3 months ago by Istvan Albert 102k

Are you talking about the table below?

How do I then get the gene names from these gene:Solyc...? Do you possibly know how I can convert them to the specific gene names?

[image: results]

ADD REPLY • link 3 months ago by duplessisantzel • 0

You should solve one problem at a time; that way you make better progress, and at least you know where you get stuck.

First you have the problem of selecting the up/downregulated IDs,
then you seem to have the problem of mapping those IDs to more meaningful gene names.

If your table is already in R, then write the few lines of code needed to select the up- and downregulated elements. Any large language model can give you that code.

Once you have that, you want to map the gene IDs to names. I don't know much about the tomato genome, but look at BioMart or some other resource to remap the gene IDs.

Here is a nice solution from ChatGPT that seems legit, though I have not verified these instructions:

Mapping Gene IDs (e.g., Solyc00g500001.1) to Gene Names

To map gene IDs like Solyc00g500001.1 (which appears to be from the tomato genome, Solanum lycopersicum) to gene names, you can use a few different approaches depending on the resources available for the organism you're working with. Below are some strategies:

1. BioMart (Ensembl Plants)

Ensembl Plants provides a BioMart tool that allows you to map IDs to gene names.
Here's how you can do it:
1. Go to Ensembl Plants BioMart.
2. Choose Solanum lycopersicum as your dataset.
3. In the "Filters" section, restrict the query to the gene IDs of interest (like Solyc00g500001.1), for example by pasting them in as an ID list.
4. In the “Attributes” section, choose fields such as "Gene stable ID" and "Gene name".
5. Export the results.
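Once you have an export, the lookup itself could be sketched as below. This assumes a tab-separated export with the two columns "Gene stable ID" and "Gene name", and a list of DESeq2 IDs that still carry the `gene:` prefix. The file names and the toy gene names are hypothetical, and you should check that the IDs in your export use the same version suffix as your own IDs.

```bash
# Made-up stand-ins for your ID list and for the BioMart export.
printf 'gene:Solyc00g500001.1\ngene:Solyc00g500002.1\n' > ids.txt
printf 'Gene stable ID\tGene name\nSolyc00g500001.1\tEXAMPLE1\nSolyc00g500003.1\tEXAMPLE3\n' > mart_export.txt

# First pass (NR == FNR) loads the export into a lookup table.
# Second pass strips the "gene:" prefix from each ID and prints ID and name;
# IDs with no name in the export are reported as NA.
awk -F'\t' -v OFS='\t' '
    NR == FNR { if (FNR > 1) name[$1] = $2; next }
    {
        id = $0
        sub(/^gene:/, "", id)
        print id, ((id in name) && name[id] != "" ? name[id] : "NA")
    }
' mart_export.txt ids.txt > ids_named.tsv
```

Many plant genes have no assigned symbol, so expect a fair number of NA entries; adding a description attribute to the export may be more informative than the name alone.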

2. NCBI Gene

NCBI Gene is another option. You can search for the specific gene ID in the NCBI Gene database for Solanum lycopersicum and look for associated gene names or descriptions.
1. Go to the NCBI Gene database.
2. Enter the gene ID (e.g., Solyc00g500001.1).
3. The page should provide the gene name and other annotations.

3. Phytozome

Phytozome is a database of plant genomes where you can search for genes by their ID and retrieve corresponding gene names and functions.
1. Search for your gene ID in the tomato genome.
2. Phytozome typically provides gene names, functional descriptions, and orthologs.

4. Custom Annotation Files

If you have a local annotation file (e.g., a GFF3 or GTF file from the tomato genome project), you can extract the gene names from there.
Tools like grep, awk, or dedicated parsers in Python or R can help you map gene IDs to gene names from these files.

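Example:

```bash
# Print the attributes column (column 9, tab-separated) of each GFF3 line that contains the ID
grep -F "Solyc00g500001.1" annotation.gff3 | awk -F'\t' '{print $9}'
```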
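To build a full ID-to-name table rather than look up a single gene, something along these lines may work. It is a sketch that assumes the gene features store the identifier in an `ID=` attribute and the name in a `Name=` attribute; attribute keys differ between annotation sources, so inspect column 9 of your own file first. The toy GFF3 lines and the file names are made up.

```bash
# Toy GFF3 with one gene and one mRNA feature; contents are invented.
printf '##gff-version 3\n' > toy.gff3
printf 'chr00\ttoy\tgene\t100\t900\t.\t+\t.\tID=gene:Solyc00g500001.1;Name=EXAMPLE1;description=made-up description\n' >> toy.gff3
printf 'chr00\ttoy\tmRNA\t100\t900\t.\t+\t.\tID=transcript:Solyc00g500001.1.1;Parent=gene:Solyc00g500001.1\n' >> toy.gff3

# For every gene feature, split the attributes on ";" and pull out ID and Name.
# Missing attributes are reported as NA.
awk -F'\t' -v OFS='\t' '
    /^#/ || $3 != "gene" { next }        # skip comments and non-gene features
    {
        id = name = "NA"
        n = split($9, attr, ";")
        for (i = 1; i <= n; i++) {
            if (attr[i] ~ /^ID=/)   id   = substr(attr[i], 4)
            if (attr[i] ~ /^Name=/) name = substr(attr[i], 6)
        }
        print id, name
    }
' toy.gff3 > id_to_name.tsv
```

The IDs here keep the `gene:` prefix, so they can be matched directly against the IDs in the DESeq2 table.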

ADD REPLY • link 3 months ago by Istvan Albert 102k

Thank you very much for your assistance, I really appreciate it. I'm quite new to all this; I have only been working on it for a few months, so I'm still learning R and DESeq.

ADD REPLY • link 3 months ago by duplessisantzel • 0