I have a list of transcripts (quantified by Salmon). May I filter them by biotype, for instance protein_coding or lncRNA, and then do GO analysis for each set separately?
Most gene ontology databases, repositories, and tools cover only the protein_coding biotype. I would recommend separating the protein-coding genes and performing gene enrichment analysis only on those. There are some specialized databases that annotate gene ontology or functional terms for lncRNAs. However, unlike protein-coding genes, the number of lncRNAs is increasing drastically and there is no consistent naming of lncRNAs between repositories, so I would prefer not to perform functional enrichment analysis on lncRNAs.
Check out Gene Set Clustering based on Functional annotation (GeneSCF) to perform enrichment analysis using your protein coding gene list.
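As a rough illustration of the filtering step, here is a minimal Python sketch that groups Salmon `quant.sf` rows by biotype using a GTF annotation. It assumes the `Name` column in `quant.sf` matches the `transcript_id` attribute in the GTF exactly (including any version suffix), and that the biotype is stored as `transcript_biotype` (Ensembl style) or `transcript_type` (GENCODE style). The function names and the toy IDs (`TX1`, `TX2`, `TX3`) are made up for the example.

```python
import csv
import re
from collections import defaultdict

# Matches quoted GTF attributes such as: transcript_id "ENST0001.1";
ATTR_RE = re.compile(r'(\S+) "([^"]*)"')


def biotypes_from_gtf(gtf_lines):
    """Map transcript_id -> biotype using the 'transcript' records of a GTF."""
    mapping = {}
    for line in gtf_lines:
        if line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "transcript":
            continue
        attrs = dict(ATTR_RE.findall(fields[8]))
        # Assumption: Ensembl GTFs use transcript_biotype, GENCODE uses transcript_type.
        biotype = attrs.get("transcript_biotype") or attrs.get("transcript_type")
        if "transcript_id" in attrs and biotype:
            mapping[attrs["transcript_id"]] = biotype
    return mapping


def split_quant_by_biotype(quant_lines, tx2biotype):
    """Group quant.sf rows by biotype; IDs missing from the map go to 'unannotated'."""
    groups = defaultdict(list)
    for row in csv.DictReader(quant_lines, delimiter="\t"):
        groups[tx2biotype.get(row["Name"], "unannotated")].append(row)
    return dict(groups)


# Toy input with hypothetical IDs; real use would pass open file handles instead.
gtf = [
    "\t".join(["chr1", "src", "transcript", "1", "100", ".", "+", ".",
               'gene_id "G1"; transcript_id "TX1"; transcript_type "protein_coding";']),
    "\t".join(["chr1", "src", "transcript", "200", "300", ".", "+", ".",
               'gene_id "G2"; transcript_id "TX2"; transcript_biotype "lncRNA";']),
]
quant = [
    "Name\tLength\tEffectiveLength\tTPM\tNumReads",
    "TX1\t100\t80.0\t10.0\t25.0",
    "TX2\t100\t80.0\t2.0\t5.0",
    "TX3\t100\t80.0\t1.0\t2.0",
]
groups = split_quant_by_biotype(quant, biotypes_from_gtf(gtf))
for biotype, rows in sorted(groups.items()):
    print(biotype, [r["Name"] for r in rows])
```

The protein_coding group can then be collapsed to gene identifiers and passed to the enrichment tool of your choice. Keeping an explicit "unannotated" group makes ID mismatches between the Salmon index and the annotation visible rather than silently dropping rows.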
Could you please give some examples of those specialized databases that annotate gene ontology or functional terms for lncRNAs? Also, at which stage do you recommend separating the protein_coding transcripts: before DGE analysis, or does it not matter if it is done afterwards?
I would recommend performing the DE analysis separately, since protein-coding expression might introduce bias due to its abundance.
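To get a feel for how much of the library each biotype takes up in your own data, a quick check along these lines may help. This is only a sketch: the function name and the numbers are made up, and it assumes you already have a biotype label and a read count (for example Salmon's `NumReads`) for each transcript.

```python
from collections import defaultdict


def read_share_by_biotype(records):
    """records: iterable of (biotype, num_reads) pairs.

    Returns a dict mapping each biotype to its fraction of the total reads.
    """
    totals = defaultdict(float)
    for biotype, num_reads in records:
        totals[biotype] += float(num_reads)
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {biotype: 0.0 for biotype in totals}
    return {biotype: total / grand_total for biotype, total in totals.items()}


# Made-up counts, purely for illustration.
example = [("protein_coding", 900.0), ("protein_coding", 50.0), ("lncRNA", 50.0)]
shares = read_share_by_biotype(example)
for biotype, share in sorted(shares.items()):
    print(f"{biotype}: {share:.2%}")
```

If one biotype accounts for most of the reads, that is the situation in which running the DE analysis per biotype, as suggested above, would make the most difference.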
Some database examples:
LncRNA Ontology
http://www.bio-bigdata.com/lncrnaontology/
LncRNA2Function
mlg.hit.edu.cn/lncrna2function/ (not working; you may need to contact the authors)
Just for your information: A: Any One please provide protocol for Analysing long noncoding RNA illumina NGS da