Hi, I hope you are doing well. I ran a BLAST alignment on a multi-gene FASTA file and retrieved the top hit for each gene as a RefSeq protein ID (such as NP_001229937.1). I want to convert these protein IDs to Entrez Gene IDs or Ensembl IDs. Is there a way to do so programmatically? I am working in R. I tried BioMart, but it returned no matches for some of the input RefSeq protein IDs.
Thank you! Is there a way to pass in a file with multiple RefSeq IDs at once? For example, instead of query, could I use something like -Input "file name"? I just need the gene symbol (and not the other information) for each RefSeq ID. My final goal is a table of gene symbols corresponding to the input file of RefSeq IDs.
Use something like this (with a file, file.txt, containing one accession per line):
To get gene names:
Hi, thank you for the quick reply! The second command you shared is exactly what I need, since it returns the gene symbol. However, I am not sure how to pass a file of multiple accessions (one per line) to this command. Could you explain how to do this?
file.txt should contain one accession per line.
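For anyone who wants to script the file-based lookup, here is a rough Python sketch. It is not the command from the reply above, just one possible approach using the NCBI E-utilities web API: ELink from protein to gene, then ESummary on the gene record. The helper names are made up for illustration. The JSON field names (`linksets`, `linksetdbs`, `links`, `result`, `name`) reflect my understanding of the E-utilities JSON output, so please check them against a real response before relying on this.

```python
import json
import time
import urllib.parse
import urllib.request

EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def read_accessions(path):
    """Return the non-empty lines of a one-accession-per-line file."""
    with open(path) as handle:
        return [line.strip() for line in handle if line.strip()]


def fetch_json(tool, params):
    """Call one E-utilities endpoint (e.g. 'elink') and decode its JSON reply."""
    url = f"{EUTILS}/{tool}.fcgi?" + urllib.parse.urlencode(params)
    with urllib.request.urlopen(url) as response:
        return json.load(response)


def gene_ids_from_elink(reply):
    """Collect linked Gene IDs from an ELink reply (JSON layout is an assumption)."""
    ids = []
    for linkset in reply.get("linksets", []):
        for linksetdb in linkset.get("linksetdbs", []):
            if linksetdb.get("dbto") == "gene":
                ids.extend(str(uid) for uid in linksetdb.get("links", []))
    return ids


def symbol_from_esummary(reply, gene_id):
    """Read the gene symbol (assumed to be the 'name' field) for one Gene ID."""
    return reply.get("result", {}).get(gene_id, {}).get("name")


def accession_to_symbol(accession):
    """Map one RefSeq protein accession to (gene_id, symbol), or (None, None)."""
    link = fetch_json("elink", {"dbfrom": "protein", "db": "gene",
                                "id": accession, "retmode": "json"})
    gene_ids = gene_ids_from_elink(link)
    if not gene_ids:
        return None, None
    time.sleep(0.4)  # pause between the two requests for this accession
    summary = fetch_json("esummary", {"db": "gene", "id": gene_ids[0],
                                      "retmode": "json"})
    return gene_ids[0], symbol_from_esummary(summary, gene_ids[0])


def main(path="file.txt"):
    """Print a tab-separated table: accession, Gene ID, gene symbol."""
    for accession in read_accessions(path):
        gene_id, symbol = accession_to_symbol(accession)
        print(accession, gene_id or "NA", symbol or "NA", sep="\t")
        time.sleep(0.4)  # stay under NCBI's limit of 3 requests/second without an API key
```

Calling `main("file.txt")` and redirecting the output to a file should give a tab-separated table that can be loaded in R with `read.delim()`. It makes one pair of requests per accession, which is slow for large files but keeps each symbol tied to its input ID. Accessions with no gene link come back as NA.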