Hey everyone,
I've been working on RNA-seq analysis and ended up with a list of differentially expressed (DE) genes identified using edgeR. However, out of 100 genes, I could only find gene symbols for 20 of them using various tools.
The remaining genes are identified only by unique locus_tag IDs, which could not be converted to gene symbols. Nevertheless, all genes have GO terms assigned by PANNZER2.
I'm now looking for a method or tool to summarize and perform functional enrichment analysis, including identifying biological pathways and Gene Ontology (GO) enrichment.
Is there a tool that can do this based on either the gene sequences or their linked GO terms? Most tools, such as goseq, DAVID, goprofile, etc., require known gene symbols.
Thanks in advance for any suggestions!
If these genes do not have a symbol, then it is utterly unlikely that anyone has looked at their function. I would get their Ensembl IDs and quickly paste them into gprofiler2. Probably nothing will come out.
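Since every gene already has GO terms attached, an over-representation test can in principle be run directly on the locus tags, with no gene symbols involved. Below is a minimal sketch of that idea, not a replacement for a dedicated package: it assumes a dictionary mapping each locus tag in the genome to its GO terms (for example, parsed from the PANNZER2 output) and a list of DE locus tags. The names `gene2go`, `de_genes`, `hypergeom_sf` and `go_enrichment` are made up for illustration. It uses a one-sided hypergeometric test per term and Benjamini-Hochberg correction over the terms that have at least one DE gene.

```python
from math import comb


def hypergeom_sf(k, M, n, N):
    """P(X >= k) when N genes are drawn from M, of which n carry the term."""
    upper = min(n, N)
    total = comb(M, N)
    hits = sum(comb(n, i) * comb(M - n, N - i) for i in range(k, upper + 1))
    return hits / total


def go_enrichment(gene2go, de_genes):
    """Return (term, de_count, term_size, p, p_adj) tuples sorted by p-value.

    gene2go  : dict, locus tag -> iterable of GO term IDs (the background)
    de_genes : iterable of DE locus tags
    """
    background = set(gene2go)
    de = set(de_genes) & background  # ignore DE genes lacking annotation

    # Invert the mapping: GO term -> set of annotated genes
    term2genes = {}
    for gene, terms in gene2go.items():
        for term in terms:
            term2genes.setdefault(term, set()).add(gene)

    results = []
    for term, genes in term2genes.items():
        k = len(genes & de)
        if k == 0:
            continue  # only test terms seen among the DE genes
        p = hypergeom_sf(k, len(background), len(genes), len(de))
        results.append((term, k, len(genes), p))
    results.sort(key=lambda r: r[3])

    # Benjamini-Hochberg adjustment, walking from the largest p-value down
    m = len(results)
    adjusted = []
    running_min = 1.0
    for rank in range(m, 0, -1):
        term, k, size, p = results[rank - 1]
        running_min = min(running_min, p * m / rank)
        adjusted.append((term, k, size, p, running_min))
    return adjusted[::-1]
```

The background here is every annotated gene in the genome, not only the DE list; using the wrong background is a common way to get misleading enrichment results.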
Thanks for your answer, but as far as I know, gProfiler2 does not support bacteria. Could you please suggest alternative websites or software where Ensembl IDs are accepted? Thanks.
Since the locus tags are unclassified, I used the CDS sequences instead of the locus tags to retrieve their Ensembl IDs via the Ensembl Bacteria website/database. The website allows BLASTing only 30 sequences at a time, and the task is to select the best-scoring ID for each gene/sequence.
Is there an alternative approach that finds the best-scoring Ensembl ID for all genes/CDSs in the genome at once, instead of going through them 30 at a time?
ATpoint
Thanks in advance for any help
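One way around the 30-sequence limit would be to run BLAST locally against a downloaded sequence set for the organism and then pick the top hit per query with a short script. The sketch below covers only that second step, and it rests on assumptions the thread does not confirm: that the search was run locally with standard 12-column tabular output (`-outfmt 6`), where column 1 is the query ID, column 2 the subject ID, column 11 the e-value and column 12 the bit score. The function name `best_hits` is made up for illustration.

```python
import csv


def best_hits(handle):
    """Map each query ID to its (subject_id, evalue, bitscore) with the top bit score.

    handle: an open text file (or any iterable of lines) in BLAST tabular
    format with the default 12 columns.
    """
    best = {}
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        query, subject = row[0], row[1]
        evalue, bitscore = float(row[10]), float(row[11])
        current = best.get(query)
        # Keep the first hit seen for a query unless a later one scores higher
        if current is None or bitscore > current[2]:
            best[query] = (subject, evalue, bitscore)
    return best
```

It could be used as `with open("hits.tsv") as fh: hits = best_hits(fh)`, where `hits.tsv` is a placeholder file name. Queries with no hit at all will simply be absent from the result, so it is worth comparing the keys against the full list of CDS IDs.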