Therefore these two sample sizes are very different, and applying the Mann-Whitney U test to them directly of course shows no significant difference.
I don't think the lack of a significant difference is due to the sample sizes being different. Why should that be the case? (In fact, I don't see the point in resampling SNPs from the large set.) Rather, the small dataset reduces power so much that the difference you see is not significant.
How do I know whether the non-significant result is due to the small sample size or because there is truly no difference between the two samples?
These are two sides of the same coin. The difference you observe is not significant because the sample size is not large enough. With huge sample sizes, even tiny differences would produce very small p-values; in that case the question would be "Is this difference biologically meaningful?"
To illustrate the point, produce two sets that differ by a small amount. With large samples the p-value for the difference is highly significant; if you downsample one set, the difference is no longer significant:
set.seed(1)
set1 <- rbeta(n = 10000, 10, 10)
set.seed(2)
set2 <- rbeta(n = 10000, 10, 9.5)
The difference between set 1 and set 2 is significant even though it is small:
mean(set1); mean(set2)
# [1] 0.5006102
# [1] 0.5112549
wilcox.test(set1, set2)
# p-value = 1.023e-11
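Since a tiny p-value says nothing about whether the difference matters, it may help to look at the size of the shift as well. A minimal sketch, assuming the same simulated sets as above (regenerated here so the snippet runs on its own); `res` is just an illustrative name. `wilcox.test()` with `conf.int = TRUE` also reports an estimate of the location shift and a confidence interval for it:

```r
# Same simulated sets as in the example above.
set.seed(1)
set1 <- rbeta(n = 10000, 10, 10)
set.seed(2)
set2 <- rbeta(n = 10000, 10, 9.5)

# conf.int = TRUE adds an estimate of the location shift (the median of the
# differences between a value from set1 and a value from set2) together
# with a confidence interval for that shift.
res <- wilcox.test(set1, set2, conf.int = TRUE)

res$estimate  # estimated shift, on the scale of the data
res$conf.int  # 95% confidence interval for the shift
```

The interval should exclude zero here, but the shift itself is small compared with the spread of the data. Whether a shift of that size is biologically meaningful is a separate question from whether it is statistically significant.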
# Now reduce one set to 14 observations:
set.seed(3)
wilcox.test(sample(set1, size = 14), set2)
# p-value = 0.4413
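A single downsample is only one draw, so to see how much power is lost you can repeat the experiment many times and count how often the test comes out significant. This is a rough sketch under the same simulated beta distributions as above; `estimate_power` and its arguments are hypothetical names made up for this illustration, not part of any package:

```r
# Hypothetical helper: estimate power as the fraction of simulated
# experiments in which the Mann-Whitney test gives p < alpha.
estimate_power <- function(n_small, n_large = 10000, n_sim = 200, alpha = 0.05) {
  p_values <- replicate(n_sim, {
    x <- rbeta(n_small, 10, 10)   # the (possibly small) group
    y <- rbeta(n_large, 10, 9.5)  # the large group, slightly shifted
    wilcox.test(x, y)$p.value
  })
  mean(p_values < alpha)
}

set.seed(4)
# Estimated power for increasing sizes of the smaller group.
sapply(c(14, 100, 1000, 10000), estimate_power)
```

With 14 observations the estimated power should be close to the 5% false-positive rate, while with 10000 it should be close to 1. In other words, for a true difference this small, a non-significant result from 14 observations tells you very little either way.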
Thank you, I'll try it.