Hi All!
I am really new to this field, so I hope some bioinformaticians or biologists can help me, even if my question probably sounds silly to you.
I want to build a co-expression network (using CEMiTool and/or WGCNA). I think I now understand the basic concepts, and I've already built my network. But now I want to do functional enrichment analysis and also include gene interaction data with CEMiTool.
My problem: where do I get the gene sets in .gmt format, and where do I get the gene interaction (protein interaction) data? At this point I don't think I really understand the difference between, or the purpose of, all the different databases and approaches. In WGCNA, for example, they seem to use DAVID. For CEMiTool, I've read about GeneMANIA and Reactome.
My data is microarray expression data from human blood cells (monocytes), from 60 individuals divided into 2 sample groups. After filtering, I have ~5000 input transcripts for the network in CEMiTool. Should I use the gene symbols of these transcripts as input for the databases, to generate the tab-delimited and .gmt files?
Thank you for any help!
Hello Pedrostrusso, I reckon you're the first author on the CEMiTool paper. Which pathways are used in the pathways.gmt file in the extdata folder of the package? Thanks.
Hi @ko2427, yes, we used Reactome pathways. However, they were edited in order to fit inside the package, so we don't recommend using them.
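For anyone unsure what a .gmt file actually contains: it is a plain tab-separated text file with one gene set per line (set name, a description, then the gene identifiers). Below is a minimal Python sketch, not CEMiTool code, that writes and reads such a file and keeps only the genes present in your own expression data. The function names (`write_gmt`, `read_gmt`, `restrict_to_background`), the `min_size` cutoff and the toy gene sets are all made up for illustration; the toy sets are not real pathway annotations. It assumes the identifiers in the .gmt file are of the same type (e.g. gene symbols) as those in the expression matrix.

```python
import io


def write_gmt(gene_sets, handle):
    """Write {set_name: (description, [genes])} as tab-separated GMT lines."""
    for name, (description, genes) in gene_sets.items():
        handle.write("\t".join([name, description, *genes]) + "\n")


def read_gmt(handle):
    """Parse GMT lines into {set_name: (description, [genes])}."""
    gene_sets = {}
    for line in handle:
        fields = line.rstrip("\r\n").split("\t")
        if len(fields) < 3:
            # A valid line needs a name, a description and at least one gene.
            continue
        gene_sets[fields[0]] = (fields[1], [g for g in fields[2:] if g])
    return gene_sets


def restrict_to_background(gene_sets, background, min_size=2):
    """Keep only genes found in `background`; drop sets that become too small."""
    background = set(background)
    kept = {}
    for name, (description, genes) in gene_sets.items():
        overlap = [g for g in genes if g in background]
        if len(overlap) >= min_size:
            kept[name] = (description, overlap)
    return kept


# Toy data for illustration only (hypothetical sets, not real pathways).
toy_sets = {
    "TOY_SET_A": ("made-up example set", ["CD14", "TLR4", "IL1B"]),
    "TOY_SET_B": ("made-up example set", ["TNF", "CCL2"]),
}
# Stand-in for the gene symbols that survived filtering in the expression data.
background = ["CD14", "TLR4", "TNF"]

buffer = io.StringIO()  # an in-memory file; a real path would work the same way
write_gmt(toy_sets, buffer)
buffer.seek(0)
parsed = read_gmt(buffer)
filtered = restrict_to_background(parsed, background)
print(filtered)
```

In practice you would download a ready-made .gmt file (for example from Reactome) instead of writing one by hand; the filtering step is optional and only shows how the gene identifiers in your data relate to those in the gene-set file.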