Should zero expression values be removed in eQTL analysis?
5.9 years ago
whtopjazz • 0

For example, suppose the RNA-seq expression values for gene1 in 10 people are GENE1=[0, 0, 1, 2, 3, 4, 3, 4, 2, 7]. SNP1 has alleles A and G, and its genotypes in the same 10 people are SNP1=[0, 1, 2, 1, 1, 2, 2, 2, 0, 0], where 0 means GG, 1 means AG, and 2 means AA.

What I want to do is eQTL analysis. Simply put, I want to fit a linear model to find out whether the expression of GENE1 is regulated by SNP1. Should I remove the zero values from the GENE1 expression values before fitting the regression model? Note that for many genes, removing the zeros would also remove most of the samples.
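To make the setup concrete, here is a minimal sketch of the regression described above, fitted on all 10 samples with the zeros kept. It assumes a plain additive model (expression regressed on the 0/1/2 genotype code) with no covariates or normalization; the helper name `fit_additive_model` is hypothetical and only for illustration.

```python
import math

# Data from the question: expression of GENE1 and SNP1 genotypes
# (0 = GG, 1 = AG, 2 = AA) in the same 10 people.
GENE1 = [0, 0, 1, 2, 3, 4, 3, 4, 2, 7]
SNP1 = [0, 1, 2, 1, 1, 2, 2, 2, 0, 0]


def fit_additive_model(genotype, expression):
    """Ordinary least squares fit of expression = intercept + slope * genotype.

    Hypothetical helper for illustration. Returns (slope, intercept, t_statistic).
    """
    n = len(genotype)
    mean_g = sum(genotype) / n
    mean_e = sum(expression) / n
    # Sums of squares and cross-products around the means.
    sxx = sum((g - mean_g) ** 2 for g in genotype)
    sxy = sum((g - mean_g) * (e - mean_e) for g, e in zip(genotype, expression))
    slope = sxy / sxx
    intercept = mean_e - slope * mean_g
    # Residual sum of squares and standard error of the slope.
    rss = sum((e - (intercept + slope * g)) ** 2
              for g, e in zip(genotype, expression))
    se_slope = math.sqrt(rss / (n - 2) / sxx)
    return slope, intercept, slope / se_slope


# All samples are used, including those with zero expression.
slope, intercept, t_stat = fit_additive_model(SNP1, GENE1)
print(f"slope={slope:.4f} intercept={intercept:.4f} t={t_stat:.4f}")
```

A p-value would come from comparing the t statistic against a t distribution with n - 2 degrees of freedom, which a statistics package can do directly.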

RNA-Seq SNP • 1.1k views
5.9 years ago
Fabio Marroni ★ 3.0k

No.

I would keep the zero values and only remove genes that have zero expression in a very large proportion of samples.

More generally, you might want to filter out genes with low variance across samples (see e.g. this paper, in the eQTL mapping paragraph of the Materials and Methods section), since they are not informative for the analysis.
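As a rough sketch of this kind of gene-level filtering: the function below drops whole genes, not individual zero values. The thresholds (`max_zero_fraction=0.8`, `min_variance=0.1`), the function name `filter_genes`, and the GENE2/GENE3 rows are assumptions made up for illustration; suitable cutoffs depend on the dataset.

```python
from statistics import variance


def filter_genes(expression_by_gene, max_zero_fraction=0.8, min_variance=0.1):
    """Keep genes that are expressed in enough samples and vary across samples.

    Hypothetical helper; both thresholds are illustrative assumptions.
    """
    kept = {}
    for gene, values in expression_by_gene.items():
        zero_fraction = sum(1 for v in values if v == 0) / len(values)
        if zero_fraction > max_zero_fraction:
            continue  # zero in a very large proportion of samples
        if variance(values) < min_variance:
            continue  # nearly constant across samples, uninformative
        kept[gene] = values  # zeros within a kept gene are left in place
    return kept


expression_by_gene = {
    "GENE1": [0, 0, 1, 2, 3, 4, 3, 4, 2, 7],  # from the question
    "GENE2": [0, 0, 0, 0, 0, 0, 0, 0, 0, 3],  # made-up: 90% zeros
    "GENE3": [5, 5, 5, 5, 5, 5, 5, 5, 5, 5],  # made-up: no variance
}

kept = filter_genes(expression_by_gene)
print(sorted(kept))
```

GENE1 from the question passes both filters with its two zero samples intact, so all 10 samples remain available for the regression.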
