I gave DESeq2 a count matrix that does not contain any gene ID information, and I obtained the results table.
Now, how can I obtain the row numbers corresponding to a particular p-value? The row numbers correspond to the gene IDs in the original data set. Please help.
How can I verify that the subset of genes selected by DESeq2 is the best subset?
How will you get any information out of the results if you are using a count matrix without gene IDs?
As I already said, row 1 corresponds to the first gene in the original dataset. Likewise, row 20530 corresponds to the 20530th gene. Okay, can you tell me how to feed a data set in Excel format to DESeq2? Thanks.
Why did you remove the gene names in the first place? If you are sure that row 20530 maps to the 20530th gene, then merge the gene_id with res.
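A minimal sketch of that merge in base R, assuming the gene IDs are still available in the original row order. Here res_df is a toy stand-in for as.data.frame(res), and gene_ids, the numbers, and the 0.05 cutoff are all made up for illustration.

```r
# Hypothetical gene IDs, in the same row order as the original count matrix.
gene_ids <- c("geneA", "geneB", "geneC", "geneD", "geneE")

# Toy stand-in for as.data.frame(res); a real DESeq2 results table
# has more columns (baseMean, log2FoldChange, ...).
res_df <- data.frame(
  pvalue = c(0.001, 0.20, NA, 0.03, 0.0004),
  padj   = c(0.004, 0.40, NA, 0.06, 0.002)
)

# Attach the IDs by position; only valid if the row order is unchanged.
stopifnot(length(gene_ids) == nrow(res_df))
res_df$gene_id <- gene_ids

# which() returns the row numbers that pass the cutoff and drops NA rows.
sig_rows  <- which(res_df$padj < 0.05)
sig_genes <- res_df$gene_id[sig_rows]
```

Either sig_rows (row numbers) or sig_genes (IDs) can then be used to pull the matching rows out of the count matrix for the classifier.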
I need to train a classifier using the output of DESeq2, which is why I decided to go with row numbers; that way it is easy for me to extract them. How do you suggest I go ahead?
What about the second part of my question? GSEA, perhaps?
Your earlier question states that you've got gene IDs present in your Excel files. Just add them back into your dataset. Presumably your counts are stored in a DGEList (or similar), so you can just add them into the genes entry of that DGEList, or put them back into the rownames. (PS: GSEA will not validate anything about a classifier; all you can do is split up your dataset and cross-validate, compare to a related study, or get back to the bench with your selected gene sets.)
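The split-and-cross-validate suggestion could be set up roughly as below in base R. The sample count, the number of folds, and the seed are arbitrary assumptions, and the loop body is only outlined in comments. The point to keep in mind is that gene selection has to be repeated inside each training fold; otherwise the held-out samples leak into the feature selection.

```r
set.seed(1)          # arbitrary seed, only for reproducibility
n_samples <- 20      # hypothetical number of samples (columns of the count matrix)
k <- 5               # hypothetical number of folds

# Randomly assign each sample to one of k equally sized folds.
fold_id <- sample(rep(seq_len(k), length.out = n_samples))

for (fold in seq_len(k)) {
  test_idx  <- which(fold_id == fold)   # held-out samples for this fold
  train_idx <- which(fold_id != fold)   # samples used for selection and training
  # 1. Run the differential expression step on the training samples only.
  # 2. Train the classifier on the genes selected in step 1.
  # 3. Evaluate the classifier on the held-out samples in test_idx.
}
```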