Repeated testing/data mining in RNA-Seq
7 months ago

Hi all,

I have an RNA-Seq data set comprising samples from a cohort of patients with a rare disease (and controls). The clinical presentation among these patients is varied, and there are many genetic subtypes of the disease within my cohort.

In addition to my regular case/control analysis, my supervisor has given me 6+ additional tests to run. These tests mainly involve comparing different subtypes of the disease.

However, I am aware that if you run enough tests you will eventually find something interesting. As such, I am wondering what the protocol is here? Is this fine? Or is it entering p-hacking territory?

Many thanks,

Rob.

repeated RNA-Seq mining testing data • 368 views

"As such, I am wondering what the protocol is here?"

My personal suggestion is to just do it and see what comes out. Multiple comparisons are often necessary to build a hypothesis. If you apply an overly stringent correction across many comparisons, you might lose potentially interesting signals. RNA-seq findings are usually the starting point for downstream analysis, which should then confirm the finding. I would not be too worried.
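
To make the trade-off concrete, here is a minimal sketch of Benjamini-Hochberg (FDR) adjustment applied in two ways: within each comparison separately, and once over the p-values pooled from all comparisons. The contrast names and p-values are made up for illustration and are not from the poster's data; in practice the raw p-values would come from whatever differential expression tool is being used.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values for the same three genes in two comparisons.
contrasts = {
    "case_vs_control": [0.001, 0.02, 0.40],
    "subtypeA_vs_subtypeB": [0.004, 0.03, 0.75],
}

# Option 1: adjust within each comparison separately.
per_contrast = {name: benjamini_hochberg(p) for name, p in contrasts.items()}

# Option 2: pool the p-values from every comparison and adjust once.
pooled_input = [p for name in contrasts for p in contrasts[name]]
pooled = benjamini_hochberg(pooled_input)

print(per_contrast)
print(pooled)
```

The two options can give different adjusted values for the same gene, because the pooled version counts every test across all comparisons. Which one is appropriate is a judgment call; either way, reporting how many comparisons were run lets readers judge the results for themselves.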


Thanks! I'll give it a go.
