First of all, I am not a statistician, and my comment reflects my amateur understanding of statistics. For the technical replicates I would take the mean, since technical replicates are intended to check for pipetting and measurement errors. If their standard deviation is reasonable, average them and use that value for the further statistics.
For the sample size, the smallest p-value a Wilcoxon test can return is limited by the number of replicates. That means that even if all values of group 1 are smaller than all values of group 2, with n=3 per group there is a minimal p-value you can get with this experimental setup.
E.g. in R:
wilcox.test(x = c(1,2,3), y=c(10,11,12))
Wilcoxon rank sum test
data: c(1, 2, 3) and c(10, 11, 12)
W = 0, p-value = 0.1
alternative hypothesis: true location shift is not equal to 0
gives the same result as the following, because the test only uses the ranks of the values and not their magnitudes:
wilcox.test(x = c(1,2,3), y=c(100,111,122))
Wilcoxon rank sum test
data: c(1, 2, 3) and c(100, 111, 122)
W = 0, p-value = 0.1
alternative hypothesis: true location shift is not equal to 0
If you want a smaller p-value, you have to increase the number of replicates. Use a power analysis to estimate the number of replicates needed for a given variance and effect size. With n=4 in one condition, for example, it would be
wilcox.test(x = c(1,2,3), y=c(100,111,122, 133))
Wilcoxon rank sum test
data: c(1, 2, 3) and c(100, 111, 122, 133)
W = 0, p-value = 0.05714
alternative hypothesis: true location shift is not equal to 0
...and so on.
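The floor seen in the outputs above can be computed directly. This is a small sketch, assuming the exact test is used and there are no ties; `min_p` is a helper name made up for illustration, not part of R.

```r
# Smallest two-sided p-value of the exact rank sum test for group sizes n1, n2.
# Only the two most extreme of the choose(n1 + n2, n1) possible rank
# arrangements count, which is the case when the groups are fully separated.
min_p <- function(n1, n2) 2 / choose(n1 + n2, n1)

min_p(3, 3)  # 0.1
min_p(3, 4)  # 0.05714286
min_p(4, 4)  # 0.02857143
min_p(5, 5)  # 0.007936508

# Cross-check against wilcox.test with fully separated groups (n1 = 3, n2 = 4)
wilcox.test(x = 1:3, y = 10:13)$p.value
```

So with 3 vs 3 the test can never reach p < 0.05, no matter how large the difference is, and 4 vs 4 is the smallest balanced design that can.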
In your case I would use an unpaired Wilcoxon (Mann-Whitney U) test because, from what I understand, your samples are independent of each other. A paired design would be, e.g., to measure the gene in an aliquot of an inducible cell line at 0 h and measure again after stimulating the remaining aliquot of the same cell line for 24 h. In your case you have independent cells, so the measurements do not influence each other.
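In R the two designs differ only in the `paired` argument of `wilcox.test`. A minimal sketch with made-up numbers (the values and the names `t0` and `t24` are purely illustrative, not real data):

```r
# Hypothetical expression values, for illustration only
t0  <- c(1.0, 1.2, 0.9)  # aliquots measured at 0 h
t24 <- c(2.1, 2.6, 1.8)  # remaining aliquots of the same lines after 24 h

# Paired design: value i in t0 and value i in t24 come from the same cell line
paired_p <- wilcox.test(t0, t24, paired = TRUE)$p.value

# Unpaired design: independent samples (Mann-Whitney U), the default
unpaired_p <- wilcox.test(t0, t24)$p.value
```

Note that with n=3 the paired (signed rank) version has an even higher floor: there are only 2^3 = 8 possible sign patterns, so its two-sided p-value cannot go below 0.25, versus 0.1 for the unpaired test.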
Thank you very much, and one last question: when I graph my data I see differences, but when I run the nonparametric test as you suggested I don't, whereas a parametric test does show them. I'm not sure I will be allowed to do a 4th or 5th replicate, since I have several targets and samples. Is it correct/feasible to assume normality and homogeneity of variance and go for the parametric test?
I do not know if this would be statistically correct, but in the literature people use t-tests all the time for qPCR data, so you can probably get away with it.
I've seen Welch's t-test used pretty consistently in the literature for qPCR; my understanding is that it's more robust than Student's t-test because it does not assume equal variances. Specifically, my PhD lab would perform this test on the ddCT values rather than on the fold changes, the idea being that these are closer to the directly measured values and already exist in log space (the ddCT is the log2FC up to sign, since FC = 2^-ddCT).
I'll also include the disclaimer that I'm not a statistician and took that advice from other lab members at the time.
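A rough sketch of what that could look like in R, again with made-up numbers. One assumption here: the test compares the per-sample dCT values (target Ct minus reference Ct) between the two groups, which is one way to read "testing on the ddCT" because the ddCT is the difference of the two group means. The variable names are illustrative.

```r
# Hypothetical dCT values (Ct_target - Ct_reference), one per biological replicate
dct_control <- c(5.1, 5.4, 4.9)
dct_treated <- c(3.2, 3.6, 3.0)

# Welch's t-test; var.equal = FALSE is already the default in t.test()
res <- t.test(dct_treated, dct_control, var.equal = FALSE)
res$p.value

# ddCT and the corresponding fold change (FC = 2^-ddCT)
ddct <- mean(dct_treated) - mean(dct_control)
fold_change <- 2^(-ddct)
```

The fold change is only computed at the end for reporting; the test itself stays on the log-scale values.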