Hello everyone.
I have downloaded normalized gene expression data for brain cortex from the GTEx Portal. Expression is available for 17,609 genes in total. I want to select only those genes that are significantly expressed in the brain cortex and ignore the others in my downstream TWAS. Is there any way to do this?
Thank you.
This is what my dataset looks like:
dput(expr_df[1:5,1:5])
structure(list(ENSG00000227232 = c(2.06593146369747, 0.592675922123552,
-0.109732708742409, 0.651749000024809, 0.636774336760061), ENSG00000268903 = c(1.19264174697961,
-0.507718909302338, -0.493924893351373, -1.327128982828, -0.347580384376274
), ENSG00000269981 = c(-0.347580384376274, -0.283623365386845,
-0.195936083375207, -0.794018835723402, 0.309052549457822), ENSG00000241860 = c(-1.05535688017356,
0.480224227652705, 1.14447383125234, 0.744935555245567, -0.89847561689514
), ENSG00000279457 = c(0.33468319748728, -0.493924893351373,
0.258376301003193, 0.507718909302338, 0.549701290113435)), row.names = c("GTEX-1117F",
"GTEX-111FC", "GTEX-1128S", "GTEX-117XS", "GTEX-1192X"), class = "data.frame")
Normalized how?
I got the normalized gene expression from the GTEx Portal. They use inverse quantile normalization.
Can you explain the normalization or give me a link that describes it, please? Comparisons across samples always come with a caveat, because RNA-seq yields relative measures.
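For reference, "inverse quantile normalization" here most likely means a rank-based inverse normal transform applied to each gene across samples. Below is a minimal sketch in R of that idea, not the exact GTEx code. The function name inverse_normal_transform, the toy values, and the rank / (n + 1) offset are assumptions made for illustration; other offsets are also in common use.

```r
# Rank-based inverse normal transform for one gene across samples.
# Assumption: ranks are mapped to quantiles with rank / (n + 1);
# this is one common choice, not necessarily the one used for your file.
inverse_normal_transform <- function(x) {
  n <- sum(!is.na(x))                                   # number of non-missing samples
  r <- rank(x, na.last = "keep", ties.method = "average")
  qnorm(r / (n + 1))                                    # map ranks to standard normal quantiles
}

# Hypothetical toy expression values for one gene in five samples.
toy_expr <- c(s1 = 12.4, s2 = 0.8, s3 = 5.1, s4 = 33.0, s5 = 2.2)
toy_int <- inverse_normal_transform(toy_expr)

# To apply it per gene (column) of a samples-by-genes data frame like expr_df:
# expr_int <- as.data.frame(lapply(expr_df, inverse_normal_transform),
#                           row.names = rownames(expr_df))
```

After such a transform every gene follows roughly the same standard normal distribution across samples, so the values no longer say how strongly a gene is expressed in absolute terms. If that is what was done to this file, a filter on expression level would probably have to be applied to the untransformed values (for example TPM or counts) instead.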