7.6 years ago
mms140130
Hi,
I need to visualize the distributions of 20,000 genes across 1,000 patients in gene expression data, to see how far they are from the normality assumption before performing an eQTL analysis, and to check whether a log transformation is meaningful (for example, that there are no bimodal distributions).
Can anyone suggest how to visualize the distributions of 20,000 genes?
Why not just use plot() for each gene?
Is this an efficient way to see all the plots at once? I would end up with 20,000 plots!
I don't know if you're better off heatmapping 20 million data points (20,000 genes × 1,000 patients) though, especially if you're not scaling the data for each gene. Maybe visual examination is out of the question. Instead, iterate through each gene and run a Shapiro-Wilk test on each to assess normality.
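A minimal sketch of that per-gene test in R, assuming the expression data sit in a numeric matrix with genes as rows and patients as columns. The matrix `expr` here is simulated and its name and layout are assumptions, not the poster's actual data:

```r
# Simulated stand-in for the real data: rows are genes, columns are patients.
set.seed(1)
n_genes <- 200        # 20,000 in the real data set
n_patients <- 1000
expr <- matrix(rnorm(n_genes * n_patients), nrow = n_genes,
               dimnames = list(paste0("gene", seq_len(n_genes)), NULL))

# Make the first gene clearly bimodal so there is something to detect.
expr[1, ] <- c(rnorm(500, mean = -3), rnorm(500, mean = 3))

# One Shapiro-Wilk test per row (gene); keep only the p-value.
# shapiro.test() accepts between 3 and 5000 non-missing values per call.
shapiro_p <- apply(expr, 1, function(x) shapiro.test(x)$p.value)
```

The result is a named numeric vector with one p-value per gene, so no plots need to be inspected by eye at this stage.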
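Collect these p-values into a vector for all of your genes and sort by p-value. The null hypothesis of the Shapiro-Wilk test is that the data are normally distributed, so a significant p-value indicates a departure from normality, not a normal distribution. With 20,000 tests you should adjust for multiple testing, and with 1,000 samples per gene even small deviations from normality can reach significance.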
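The sorting step could look roughly like this. The p-values below are made-up placeholders standing in for the per-gene Shapiro-Wilk results, and the Benjamini-Hochberg adjustment and the 0.05 cutoff are assumptions about how one might handle the multiple testing, not something the thread specifies:

```r
# Hypothetical p-values standing in for the per-gene Shapiro-Wilk results.
shapiro_p <- c(geneA = 0.40, geneB = 1e-12, geneC = 0.03, geneD = 0.75)

# Put raw and Benjamini-Hochberg adjusted p-values side by side.
res <- data.frame(gene = names(shapiro_p),
                  p = unname(shapiro_p),
                  p_adj = p.adjust(shapiro_p, method = "BH"),
                  stringsAsFactors = FALSE)

# Smallest p-values first: these genes deviate most from normality.
res <- res[order(res$p), ]

# Genes flagged as non-normal after adjustment (0.05 is an arbitrary cutoff).
non_normal <- res$gene[res$p_adj < 0.05]
```

The genes at the top of `res` are the ones worth plotting individually, for example with `hist()` or `plot(density(...))`, to check for bimodality.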