Hello,
I am new to bioinformatics, so please excuse me if I am asking a simple question.
I have a few pairs of breast tumor and normal RNA-seq samples. I performed differential gene expression analysis using DESeq2 and got a long list (~1200) of differentially expressed genes. I also have access to RNA-seq data from 10 other normal tissues (kidney, heart, brain, lung, etc.). My adviser asked me to find pairs of genes that are over-expressed in breast tumors and can distinguish between tumor and normal cells.
I do not have an algorithms background, so I am not sure how to do this. It would be great if you could point me in the right direction. Is there a tool that can do this?
Thank you
I'd put your own dataset in the drawer for now, build a discriminant on external breast cancer datasets, and then use your data only to validate it.
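To make the train-elsewhere, validate-on-your-own-data idea concrete, here is a minimal sketch. It is not a DESeq2 feature or a named tool. It uses a simple rank-based rule (similar in spirit to what is sometimes called a top-scoring pair classifier): for every pair of genes, ask how much more often gene 1 is below gene 2 in tumors than in normals. The gene names, sample names, and expression values are all made up for illustration, and the dict-of-dicts layout is just an assumption about how you might hold the data.

```python
from itertools import combinations

def pair_score(expr, labels, g1, g2):
    """Fraction of tumors with g1 < g2 minus the same fraction in normals."""
    tumor = [s for s, lab in labels.items() if lab == "tumor"]
    normal = [s for s, lab in labels.items() if lab == "normal"]
    frac_tumor = sum(expr[g1][s] < expr[g2][s] for s in tumor) / len(tumor)
    frac_normal = sum(expr[g1][s] < expr[g2][s] for s in normal) / len(normal)
    return frac_tumor - frac_normal

def train_pair_rule(expr, labels):
    """Pick the gene pair whose ordering differs most between the classes."""
    best = max(combinations(sorted(expr), 2),
               key=lambda pair: abs(pair_score(expr, labels, *pair)))
    return best, pair_score(expr, labels, *best)

def predict(expr, sample, pair, score):
    """Call 'tumor' when the sample shows the ordering that tumors showed in training."""
    g1, g2 = pair
    g1_lower = expr[g1][sample] < expr[g2][sample]
    return "tumor" if g1_lower == (score > 0) else "normal"

def accuracy(expr, labels, pair, score):
    hits = sum(predict(expr, s, pair, score) == lab for s, lab in labels.items())
    return hits / len(labels)

# Hypothetical "external" training data: gene -> sample -> expression value.
ext_expr = {
    "GENE_A": {"t1": 1, "t2": 2, "n1": 8, "n2": 9},
    "GENE_B": {"t1": 5, "t2": 6, "n1": 4, "n2": 3},
    "GENE_C": {"t1": 5, "t2": 1, "n1": 5, "n2": 2},
}
ext_labels = {"t1": "tumor", "t2": "tumor", "n1": "normal", "n2": "normal"}

# Hypothetical "own" data, kept aside and used only for validation.
own_expr = {
    "GENE_A": {"myT": 2, "myN": 7},
    "GENE_B": {"myT": 6, "myN": 5},
    "GENE_C": {"myT": 3, "myN": 3},
}
own_labels = {"myT": "tumor", "myN": "normal"}

pair, score = train_pair_rule(ext_expr, ext_labels)
own_accuracy = accuracy(own_expr, own_labels, pair, score)
print(pair, score, own_accuracy)
```

A rule that only compares two genes within the same sample is a convenient choice here because it does not depend on the external and in-house data sharing a normalization, but that is a design assumption of this sketch rather than a requirement of the advice above.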
You might want to just try some clustering approaches. You could try co-expression methods, like WGCNA.
Unless you have hundreds or thousands of normal and tumor samples, you're going to have a very hard time tackling such a high-dimensional problem. I don't think more powerful machine learning approaches are worth trying without that much data.
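A rough back-of-the-envelope calculation shows why the sample size matters so much. The ~1200 genes come from the question; the 3 tumor and 3 normal samples are a hypothetical stand-in for "a few pairs", and the null model (each sample's ordering of two genes is an independent coin flip) is a deliberately crude assumption.

```python
from math import comb

n_genes = 1200       # size of the DE gene list in the question
n_per_group = 3      # hypothetical: 3 tumor and 3 normal samples

# Number of distinct gene pairs that could be tested.
n_pairs = comb(n_genes, 2)

# Crude null: for one pair, each sample's ordering (g1 < g2 or not) is a fair
# coin flip. The pair separates the groups perfectly if all tumors fall one way
# and all normals the other, in either direction.
p_perfect = 2 * 0.5 ** (2 * n_per_group)

expected_by_chance = n_pairs * p_perfect
print(n_pairs, expected_by_chance)  # 719400 22481.25
```

Under these assumptions, tens of thousands of pairs would look like perfect discriminators purely by chance, which is why any pair found this way needs validation on independent samples.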