I have TPM expression data from an RNA-seq analysis. The data comprises not only protein-coding genes but also several other biotypes such as miRNA, lncRNA, pseudogene, etc., bringing the matrix to around 60,000 genes. Should I filter my data by biotype='protein coding'? I am using the hallmark gene sets from MSigDB for enrichment.
Gene set enrichment analysis (GSEA) of non-coding RNAs such as miRNAs and lncRNAs needs gene sets that are specific to them. Although the Molecular Signatures Database (MSigDB) is a major source of gene sets, it primarily contains protein-coding genes. The following databases and services, on the other hand, can be used to obtain or build gene sets for non-coding RNAs:
miRNA-related Gene Sets:
miRTarBase: Contains experimentally verified miRNA-target interactions. These can be used to create gene sets based on the targets of particular miRNAs.
TarBase: A database of experimentally supported miRNA-target interactions (DIANA-TarBase). Users can construct gene sets from the supported targets of each miRNA.
miRDB: An online database for miRNA target prediction and functional annotations. Users can create gene sets based on predicted miRNA targets.
Pharmaco-miR: Provides gene sets connecting miRNAs to drug effects, which may be helpful for pharmacogenomics research.
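As an illustration of how a target table from one of these resources could be turned into gene sets, here is a minimal sketch. It assumes you have already exported (miRNA, target gene) pairs from the database of your choice; the function names, the `min_size` cutoff, and the toy identifiers are all hypothetical and not part of any of these databases. The output uses the tab-separated GMT layout (set name, description, then genes) that GSEA tools commonly read.

```python
from collections import defaultdict


def build_gene_sets(pairs, min_size=2):
    """Group (mirna, target_gene) pairs into one gene set per miRNA.

    Duplicate pairs are collapsed, and sets with fewer than `min_size`
    genes are dropped (the cutoff is an arbitrary illustrative choice).
    """
    sets = defaultdict(set)
    for mirna, gene in pairs:
        sets[mirna].add(gene)
    return {m: sorted(g) for m, g in sets.items() if len(g) >= min_size}


def to_gmt_lines(gene_sets, description="miRNA_targets"):
    """Format gene sets as GMT lines: name, description, genes (tab-separated)."""
    return [
        "\t".join([name, description, *genes])
        for name, genes in sorted(gene_sets.items())
    ]


# Toy placeholder pairs, not real interactions.
pairs = [
    ("miR-A", "GENE1"),
    ("miR-A", "GENE2"),
    ("miR-A", "GENE1"),  # duplicate, collapsed by the set
    ("miR-B", "GENE3"),  # only one target, dropped by min_size=2
]

gene_sets = build_gene_sets(pairs)
gmt_lines = to_gmt_lines(gene_sets)
```

Writing `"\n".join(gmt_lines)` to a `.gmt` file would give something most GSEA implementations can load, but check the identifier type (symbols vs. Ensembl IDs) against your expression matrix first.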
lncRNA-related Gene Sets:
lncRNAdb: A database devoted to long non-coding RNAs. Although it doesn't offer gene sets directly, it can be a useful resource for creating custom gene sets.
LNCipedia: A comprehensive database of human long non-coding RNAs, which is helpful for building gene sets based on the expression or function of lncRNAs.
NONCODE: A comprehensive knowledge base for non-coding RNAs (not including tRNAs and rRNAs) that can be used to create custom gene sets.
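If the matrix is restricted to protein-coding genes before running GSEA against the hallmark sets, as the question proposes, the filter itself is simple. This is a minimal sketch, assuming the TPM matrix is held as a mapping from gene ID to per-sample values and that a separate gene-to-biotype annotation is available (for example, parsed from the GTF used for quantification). The function name, the data layout, and the toy identifiers are hypothetical; the label `protein_coding` is assumed to match the spelling in your annotation.

```python
def filter_by_biotype(tpm, biotypes, keep="protein_coding"):
    """Return only the rows of `tpm` whose gene is annotated with `keep`.

    Genes missing from `biotypes` are dropped, since their biotype is unknown.
    """
    return {
        gene: values
        for gene, values in tpm.items()
        if biotypes.get(gene) == keep
    }


# Toy data with placeholder gene IDs: two samples per gene.
tpm = {
    "GENE1": [10.0, 12.5],
    "LNC1": [0.5, 0.0],
    "MIR1": [3.0, 1.0],
    "GENE2": [0.0, 4.2],
}
biotypes = {
    "GENE1": "protein_coding",
    "LNC1": "lncRNA",
    "MIR1": "miRNA",
    "GENE2": "protein_coding",
}

coding_tpm = filter_by_biotype(tpm, biotypes)
```

Note that TPM values are scaled across all quantified features, so subsetting rows afterwards leaves columns that no longer sum to one million; whether that matters depends on the ranking metric used for enrichment.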
Just add a statement to your comments stating that the text was generated using ChatGPT - it is a good tool but it's not aware of its own shortcomings so you'll be blamed for any of its mistakes.
Did you use ChatGPT (or any other LLM) for this answer? Have you used ChatGPT (or any other LLM) for any of your previous posts?
@Ram there are 2 posts from me that are that way (stupidity of mine): this one and "p-value combination methods". Can you remove those comments?