Convert Gene ID To Gene Name
3 months ago
min • 0

I'm trying to convert a list of gene IDs to gene symbols. However, it seems that I cannot use library("AnnotationDbi"). I used featureCounts to generate count_data.csv from a GTF file obtained from NCBI. Here is my count_data.csv:

count_data.csv

Could someone help me with how to map Gene IDs to Gene Symbols using a different approach?

I would greatly appreciate any guidance.


Please do not paste screenshots of plain-text content; it is counterproductive. You can copy and paste the content directly here (using the code formatting option shown below), or use a GitHub Gist if the content exceeds the length allowed here.

code_formatting


I will pay attention next time.


What species are you looking at? You could use the BioMart data mining tool from Ensembl (https://www.ensembl.org/biomart/martview). You can supply the gene IDs from your file and ask BioMart to return the corresponding gene symbols. It is difficult to help with the AnnotationDbi package because you did not mention what error you get.


Thank you for your response. I am analyzing a strain of S. aureus, but my strain is not available on Ensembl Bacteria.


Hello,

You can obtain gene symbols/gene names from a GFF file, based on the features for which you have read abundance data.

Example: if you quantified read abundance per transcript with featureCounts, you can extract the transcript-related information (transcript IDs, gene IDs, gene names, and descriptions) from the GFF, then match the gene symbols/gene names to the transcripts in your count matrix.
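As a rough illustration of the matching step described above, here is a minimal Python sketch using only the standard library. It makes several assumptions that are not stated in the thread: that the annotation is a GTF whose ninth column holds `key "value";` pairs, that the gene name sits in a `gene` or `gene_name` attribute (as is common in NCBI and Ensembl GTFs respectively), and that the count table uses the `Geneid` column that featureCounts normally writes. The function names `gtf_gene_names` and `annotate_counts` are made up for this example.

```python
import re

# Matches GTF attribute pairs such as: gene_id "SAOUHSC_00001";
ATTR_RE = re.compile(r'(\S+)\s+"([^"]*)"')


def gtf_gene_names(gtf_lines, name_keys=("gene", "gene_name")):
    """Return {gene_id: gene name} for GTF records that carry a name attribute.

    name_keys is an assumption: adjust it to whatever attribute your GTF uses.
    """
    names = {}
    for line in gtf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip comments and blank lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # not a complete GTF record
        attrs = dict(ATTR_RE.findall(fields[8]))
        gene_id = attrs.get("gene_id")
        if gene_id is None:
            continue
        for key in name_keys:
            if attrs.get(key):
                # Keep the first name seen for each gene ID.
                names.setdefault(gene_id, attrs[key])
                break
    return names


def annotate_counts(count_rows, names, id_column="Geneid"):
    """Yield copies of the count rows with an added GeneName column.

    Rows whose ID has no name in the GTF get an empty string.
    """
    for row in count_rows:
        annotated = dict(row)
        annotated["GeneName"] = names.get(row[id_column], "")
        yield annotated


# Tiny made-up example: G1 has a gene name, G2 does not.
example_gtf = [
    'chr\tRefSeq\tgene\t1\t100\t.\t+\t.\tgene_id "G1"; gene "dnaA"; locus_tag "G1";',
    'chr\tRefSeq\tgene\t200\t300\t.\t+\t.\tgene_id "G2"; locus_tag "G2";',
]
example_counts = [
    {"Geneid": "G1", "sample1": "10"},
    {"Geneid": "G2", "sample1": "5"},
]
example_names = gtf_gene_names(example_gtf)
example_annotated = list(annotate_counts(example_counts, example_names))
```

With real files, you could pass an open GTF file handle as `gtf_lines` and the rows from `csv.DictReader` as `count_rows`. Genes with no name attribute stay blank rather than being dropped, so they are easy to spot afterwards.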


Thank you for your response. I tried this method and it worked. However, many GeneIDs do not include Gene names in the GTF file. How can I resolve this issue?


Then, for those genes, you can try to get gene names from the UniProt database for each gene ID, specific to the strain.

You can also try the bioDBnet website [https://biodbnet-abcc.ncifcrf.gov/db/db2db.php] for organism-specific gene names. The input is the gene IDs and the Tax ID of the organism.

cheers!
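Before using either lookup service, it can help to pull out just the IDs that still lack a name. The following is a small Python sketch of that step, not part of any tool mentioned above. It assumes you already have a dictionary mapping gene IDs to names (empty or absent where the GTF gave none). The function names are hypothetical, and the one-ID-per-line output is an assumption, so check which separator the service you use actually accepts.

```python
def ids_without_names(gene_ids, names):
    """Return gene IDs that have no usable name, in input order, without duplicates."""
    seen = set()
    missing = []
    for gene_id in gene_ids:
        if gene_id in seen:
            continue
        seen.add(gene_id)
        # Treat absent, empty, and whitespace-only names as missing.
        if not names.get(gene_id, "").strip():
            missing.append(gene_id)
    return missing


def as_query_text(gene_ids):
    """Join IDs one per line, ready to paste into a web form (separator is an assumption)."""
    return "\n".join(gene_ids)


# Made-up example: G2 has an empty name and G3 is absent from the mapping.
known_names = {"G1": "dnaA", "G2": ""}
all_ids = ["G1", "G2", "G3", "G2"]
unnamed = ids_without_names(all_ids, known_names)
query_text = as_query_text(unnamed)
```

Once the service returns names for some of these IDs, you can add them to the same dictionary and repeat the check to see what is still unresolved.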


I realize that my issue is more complex. I will make a new post and describe it in detail.


Which genome are these IDs from?


Thank you for your response. These IDs are from my GTF file.


If gene IDs are not available in the GTF, you will need to do additional work to figure out what each gene could be. At a minimum, this could involve BLASTing the protein sequences to identify the genes. You could also try tools like LiftOffTools (LINK) or RATT (RATT) if annotations are available for a closely related genome.
