Hi!
I would like to ask for advice on the proper format of GENBANK_GI_ACCESSION identifiers for functional analysis in DAVID. I tried these formats:
- gi123456
- gi|123456
- 123456
Sadly, none of them worked, and I could not find any examples. I prefer GI accessions because all of my unigenes of interest have one. I do have RefSeq counterparts and gene symbols for some of them, but not for all. And yes, I'm not sure what I'm doing. Any help will be appreciated!
Thank you for the prompt answer. I do have the accessions; however, I am not sure how to use the different input types for the analysis in DAVID. For example, my accessions are tagged ref (mostly XP_), dbj, gb, or sp. How do I convert these accession IDs into a format that DAVID can interpret? I am having a hard time finding a way to do this. For example, I tried the Retrieve/ID mapping tool of UniProtKB, but not all of my unigenes with a GI matched a UniProt entry. I do not know how to proceed from there.
DAVID ought to understand many different types of accession numbers. Try something simple first: use only a subset of the gene names to get your bearings and make sure that it works. If you are not sure what to pick, start here:
http://data.biostarhandbook.com/redo/zika/zika-up-regulated.csv
Take the 20 gene names from the first column and see if you can make DAVID work.
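If it helps, here is a minimal Python sketch for pulling the first 20 names out of the first column of a CSV file so they can be pasted into DAVID one per line. It assumes the file has a header row and that the gene names sit in the first column; the function name `first_column_names` and the tiny sample table are made up for illustration and are not taken from the linked file.

```python
import csv
import io


def first_column_names(csv_text, limit=20, has_header=True):
    """Return up to `limit` unique, non-empty values from the first CSV column."""
    reader = csv.reader(io.StringIO(csv_text))
    # Skip blank lines and rows whose first cell is empty.
    rows = [row for row in reader if row and row[0].strip()]
    if has_header:
        rows = rows[1:]  # assumption: the first row holds column names
    names = []
    for row in rows:
        name = row[0].strip()
        if name not in names:
            names.append(name)
        if len(names) == limit:
            break
    return names


# Hypothetical sample data; replace with the contents of the downloaded file.
sample = "gene,log2FoldChange\nGENE_A,2.1\nGENE_B,1.7\nGENE_C,1.2\n"
print("\n".join(first_column_names(sample, limit=2)))
```

For the real file, read it with `open(path).read()` and pass the text in with the default `limit=20`.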
I would also recommend an alternative tool:
https://biit.cs.ut.ee/gprofiler/gost
and the converter here:
https://biit.cs.ut.ee/gprofiler/convert
I have the Official Gene Symbols, which actually work. My confusion is about the unigenes that have nr hits but no gene symbols. Should I just proceed with those that have gene symbols? This is why I was looking for a way to give all the unigenes with nr hits some other accession (e.g. gene symbol, UniProt) to represent them. I'm not even sure if this is possible, though.
As I have mentioned, most of my unigenes had protein hits with an XP_ accession, but some have a gb| or dbj| reference instead. The GI number is the only ID that is present for all my unigenes with nr hits. Is it possible to convert all unigenes with nr hits into their corresponding UniProt IDs or gene symbols? I'm asking because I could not find an XP_ counterpart for those with gb| or dbj| (e.g. gb|ABC87995.1). Or is it okay to proceed with the analysis using just the unigenes that have gene symbols? I'm so confused that I don't know if my questions are valid.
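One practical first step with mixed identifiers like these is to split them by source database, so that each group can be submitted to a converter with the matching identifier type. Below is a minimal Python sketch, assuming the nr hit IDs are in the pipe-delimited style shown above (for example `gi|123456|ref|XP_000001.1|`). The function names and every example ID except `gb|ABC87995.1` are placeholders made up for illustration. The sketch only parses strings; whether a given converter recognizes each accession type still has to be checked separately.

```python
from collections import defaultdict


def split_legacy_id(hit_id):
    """Split an ID such as 'gi|123456|ref|XP_000001.1|' into (gi, db_tag, accession).

    Missing parts are returned as None.
    """
    tokens = hit_id.strip().split("|")
    gi, db_tag, accession = None, None, None
    # The tokens alternate: tag, value, tag, value, ...
    for tag, value in zip(tokens[0::2], tokens[1::2]):
        if not value:
            continue
        if tag == "gi":
            gi = value
        elif db_tag is None:
            db_tag, accession = tag, value
    return gi, db_tag, accession


def group_by_database(hit_ids, strip_version=False):
    """Group accessions by their database tag (ref, gb, dbj, sp, ...)."""
    groups = defaultdict(list)
    for hit_id in hit_ids:
        _gi, db_tag, accession = split_legacy_id(hit_id)
        if accession is None:
            continue
        if strip_version:
            # Some tools expect 'ABC87995' rather than 'ABC87995.1'.
            accession = accession.split(".")[0]
        groups[db_tag].append(accession)
    return dict(groups)


# Placeholder IDs, except gb|ABC87995.1 which is quoted from the post above.
hits = ["gi|123456|ref|XP_000001.1|", "gb|ABC87995.1|", "gi|654321|dbj|BAA00001.1|"]
for tag, accessions in group_by_database(hits).items():
    print(tag, ",".join(accessions))
```

Each group can then be tried on its own in the converter, and any unigenes that still fail to map can be listed and reported separately instead of being dropped silently.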