I am running a limma analysis on a data set comprising 4 groups with 50 samples each. In total I have 5 different comparisons: Group1 vs Group2, Group1 vs Group3, and so on. limma gives me a set of differentially expressed genes for each comparison.
Next I want to do a leave-one-out cross-validation (LOOCV) of the results for each group comparison, so 5 different LOOCVs in total. In each LOOCV iteration I redo the feature selection with limma. The problem is that during the LOOCV I can include only the two groups being compared, i.e. 100 samples per LOOCV. The limma results will then differ between the 100-sample subset and the full 200-sample data set, because the normalisation and filtering steps will be affected differently.
Is it correct to do the LOOCV with feature selection on only the 100 samples?
I don't understand why you want to do LOOCV. Cross-validation is normally applied to classification methods, but limma does not do classification. Cross-validation is intended to overcome the bias introduced by training a classification method on the data, but limma doesn't have any such bias. Exactly what quantity in the limma results are you seeking to "validate"?
The model is based on the limma results, so I have to do a LOOCV. LOOCV is supposed to test the performance of a model; here my model consists of the significant features from limma, that is, the feature selection is done with limma.
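To make the procedure concrete, here is a minimal sketch of a LOOCV in which the feature selection is repeated inside every iteration, using only the samples of the two groups being compared. It is written in plain Python rather than R, and several parts are assumptions that are not in the thread: a Welch t-statistic stands in for limma's moderated t-statistic, a nearest-centroid rule stands in for whatever classifier is actually used, the data are simulated, and the names `select_features`, `nearest_centroid_predict` and `loocv_accuracy` are hypothetical. Normalisation and filtering are not shown; they would also have to be redone on the training samples of each fold.

```python
import math
import random
from statistics import mean, variance


def welch_t(a, b):
    """Welch t-statistic for two lists (a stand-in for limma's moderated t)."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se if se > 0 else 0.0


def select_features(X, y, k):
    """Rank genes by |t| on the given samples; return the top-k column indices."""
    g0 = [row for row, lab in zip(X, y) if lab == 0]
    g1 = [row for row, lab in zip(X, y) if lab == 1]
    scores = []
    for j in range(len(X[0])):
        t = welch_t([r[j] for r in g0], [r[j] for r in g1])
        scores.append((abs(t), j))
    scores.sort(reverse=True)
    return [j for _, j in scores[:k]]


def nearest_centroid_predict(X, y, x_new, features):
    """Assign x_new to the class whose centroid on the selected genes is closest."""
    best_label, best_dist = None, float("inf")
    for label in (0, 1):
        rows = [row for row, lab in zip(X, y) if lab == label]
        dist = sum((x_new[j] - mean(r[j] for r in rows)) ** 2 for j in features)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def loocv_accuracy(X, y, k=5):
    """Leave-one-out CV with feature selection redone inside every fold."""
    correct = 0
    for i in range(len(X)):
        # Hold out sample i; it must not influence the feature selection.
        X_train = X[:i] + X[i + 1:]
        y_train = y[:i] + y[i + 1:]
        features = select_features(X_train, y_train, k)
        predicted = nearest_centroid_predict(X_train, y_train, X[i], features)
        correct += predicted == y[i]
    return correct / len(X)


# Simulated two-group data (assumption): 10 samples per group, 30 genes,
# of which the first 3 are shifted upwards in the second group.
rng = random.Random(0)
n_per_group, n_genes, n_signal = 10, 30, 3
X, y = [], []
for label in (0, 1):
    for _ in range(n_per_group):
        row = [rng.gauss(0, 1) for _ in range(n_genes)]
        for j in range(n_signal):
            row[j] += 4.0 * label
        X.append(row)
        y.append(label)

accuracy = loocv_accuracy(X, y, k=3)
print(f"LOOCV accuracy: {accuracy:.2f}")
```

The quantity being validated here is the prediction accuracy of the classifier built on the selected genes, not the gene list itself. Selecting the genes once on all samples and then cross-validating only the classifier would let the held-out sample influence the selection and give an optimistic estimate.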