Hi, I'm trying to simulate the 22 chromosomes with sim1000G package in R using the for loop the problem is that for the first run of the loop, the code works well but after it cant produce the genetics map.
This is my code
for (i in c(1:2)) {
print("#####")
print(i)
print("######")
vcf_file = file.path("chr_22_final2.vcf")
vcf = readVCF( vcf_file , maxNumberOfVariants = 2000 , min_maf = 0.02 , max_maf = NA )
startSimulation(vcf, totalNumberOfIndividuals = 1200)
ids = generateUnrelatedIndividuals(1000)
genotype = retrieveGenotypes(ids)
E_vector=genotype[,i+300]
#E_vector[E_vector==2]=1
#genotype[genotype==2]=1
result_matrix=genotype[,-(i+300)]
data_after_sim_funce=data.gen(sample_size=900, p=1999,
n_g_non_zero=200, n_gxe_non_zero=20,
family="binomial", mode="strong_hierarchical")
tune_model = gesso.cv(data_after_sim_funce$G_train, data_after_sim_funce$E_train, data_after_sim_funce$Y_train,
grid_size=20, tolerance=1e-4,
parallel=TRUE, nfold=4,
normalize=FALSE, normalize_response=FALSE,
seed=1)
gxe_coefficients = gesso.coef(tune_model$fit, tune_model$lambda_min)$beta_gxe
g_coefficients = gesso.coef(tune_model$fit, tune_model$lambda_min)$beta_g
print(length(g_coefficients[g_coefficients>0]))
print(length(gxe_coefficients[gxe_coefficients>0]))
num_main=length(g_coefficients[g_coefficients>0])
num_iter=length(gxe_coefficients[gxe_coefficients>0])
num_main_large_zero[i]=num_main
num_iter_large_zero[i]=num_iter
}
for index i =1 it works correctly but when the loop moves to i=2 it returns an error : [#####...] Creating SIM object [#####...] Haplodata object created Error in startSimulation(vcf, totalNumberOfIndividuals = 1200) : Error: Genetic map does not match the region being simulated
I can't seem to understand what is happening because the code worked perfectly well for i=1 but crashes on i=2. I would be happy for your opinion :) should i delete after every iteration the "old" SIM object ?