Hi, this may be a basic question, but I want to find differentially expressed genes (DEGs) from RNA-seq data. First, should I use only the tumor samples that have a matched tumor-adjacent normal sample, or should I compare all tumor samples (including those from patients with no normal sample) against the few normal samples I have?
Also, before finding DEGs, should the normalisation be applied across all samples, rather than only to samples belonging to the same patient?
At the moment I have a count matrix, but it has more than 60,000 Ensembl IDs as row names. I understand these may include noncoding regions, so I mapped them to Entrez genes, but even after that I still have about 24,000 IDs. Is there any further filtering I need to apply?
Thanks in advance.
Thanks. 1) That was a typing error; genes are the rows. 2) For normalisation, do I need to exclude the other samples?
You can keep all the samples in for normalization.
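To illustrate why the choice of samples matters, here is a minimal sketch of median-of-ratios size factors (the general idea used by tools such as DESeq2) computed over the whole genes x samples matrix. The function name `size_factors` and the toy counts are made up for illustration, and this is not a replacement for a proper DE package. The per-gene reference is a geometric mean across every column, so the factors depend on which samples are included.

```python
import numpy as np

def size_factors(counts):
    """Median-of-ratios size factors for a genes x samples count matrix."""
    counts = np.asarray(counts, dtype=float)
    # Keep only genes with a nonzero count in every sample so the log is finite.
    expressed = np.all(counts > 0, axis=1)
    log_counts = np.log(counts[expressed])
    # Per-gene log geometric mean across ALL samples serves as the reference.
    log_ref = log_counts.mean(axis=1, keepdims=True)
    # A sample's size factor is the median of its ratios to that reference.
    return np.exp(np.median(log_counts - log_ref, axis=0))

# Toy data (hypothetical): 4 genes (rows) x 3 samples (columns).
# Sample 2 has twice the counts of sample 1 for the first three genes.
counts = np.array([
    [10, 20, 10],
    [100, 200, 100],
    [50, 100, 50],
    [0, 5, 3],
])

factors = size_factors(counts)   # one factor per sample (column)
normalized = counts / factors    # divide each column by its size factor
```

After dividing by the factors, the first three genes have the same normalized value in samples 1 and 2, which is what you would expect when the only difference is sequencing depth.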