Should genes with missing values in one condition but high significance in the other be kept or removed? [NGS]
8.8 years ago

I am sure this has been discussed here and there in other posts and comments, but I am writing it up as a new question.

What is the consensus on missing data points (absent gene values) in one condition that are highly significant in the other?

Consider the following MA plot. I have set up several thresholds and colored the points according to them. If you look at the plot, the two diagonal lines (black-green and black-orange) that protrude in opposite directions (going up and down, at 45 degrees) are the points that are quite significant in one condition but have missing values in the other.
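To illustrate why such points line up on diagonals, here is a minimal sketch. It assumes the usual MA definitions, M = log2(x) - log2(y) and A = 0.5 * (log2(x) + log2(y)), and that a pseudocount (1 here, purely an assumption) is added to both conditions to avoid log2(0). The function name and the example counts are hypothetical and not taken from my data.

```python
import math

PSEUDOCOUNT = 1.0  # assumption: small value added so that log2(0) is defined


def ma_point(x, y, pc=PSEUDOCOUNT):
    """Return (A, M) for one gene from its values in two conditions."""
    log_x = math.log2(x + pc)
    log_y = math.log2(y + pc)
    a = 0.5 * (log_x + log_y)  # A: mean log2 signal
    m = log_x - log_y          # M: log2 fold change
    return a, m


# Hypothetical genes with signal in only one condition.
values = (3, 15, 63, 255)
only_in_x = [ma_point(v, 0) for v in values]
only_in_y = [ma_point(0, v) for v in values]

for (a_up, m_up), (a_down, m_down) in zip(only_in_x, only_in_y):
    print(f"A={a_up:.1f}  M(up)={m_up:+.1f}  M(down)={m_down:+.1f}")
```

With a pseudocount of 1 the missing side contributes log2(1) = 0, so every such gene satisfies M = 2A or M = -2A. These are two straight lines fanning out from the origin, and how steep they look depends on the axis scaling of the plot.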

These values come from ChIP-Seq data, and we also know that missing data doesn't necessarily mean that the signal for that gene is completely absent biologically; it could arise from experimental or computational issues.

So, simply put: should these points (genes) stay, with an explanation, or must they go?

[Image: MA plot with points colored by threshold]

ChIP-Seq RNA-Seq R next-gen statistics • 1.5k views

Until proven otherwise (independently and experimentally), they should stay, since this is what you have in the data.


Right, but I have still seen some people remove these and mention it in the text.


As long as you document what was taken out (I am not sure there can be a valid "why"), you will at least make the analysis reproducible for others.
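One way to do that documenting might look like the sketch below: split off the one-sided genes and write them to a log instead of dropping them silently. The field names `gene`, `count_a`, `count_b`, the `min_count` cutoff, and the example rows are all hypothetical; adapt them to whatever your table actually contains.

```python
import csv
import io


def split_one_sided(rows, min_count=0):
    """Separate genes that have signal in exactly one condition."""
    kept, removed = [], []
    for row in rows:
        absent_a = row["count_a"] <= min_count
        absent_b = row["count_b"] <= min_count
        if absent_a != absent_b:  # missing in one condition only
            removed.append(row)
        else:
            kept.append(row)
    return kept, removed


def removal_log(removed):
    """Render the removed genes as CSV text to ship with the analysis."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf, fieldnames=["gene", "count_a", "count_b"], lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(removed)
    return buf.getvalue()


# Hypothetical example table.
rows = [
    {"gene": "geneA", "count_a": 120, "count_b": 95},
    {"gene": "geneB", "count_a": 240, "count_b": 0},
    {"gene": "geneC", "count_a": 0, "count_b": 310},
    {"gene": "geneD", "count_a": 0, "count_b": 0},
]
kept, removed = split_one_sided(rows)
log_text = removal_log(removed)
print(log_text)
```

Keeping the log next to the results lets others rerun the analysis with or without those genes.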
