I am working with paired-end metagenomic Illumina data.
When I provide a list file with multiple sample entries, does KmerGenie choose a single k-mer size for all the samples in the list, or how does it treat the sample list?
I mean, for metagenomic assembly, is it feasible to use one k-mer size for all samples, or do you recommend running KmerGenie on each sample separately?
KmerGenie most likely won't work for metagenomic data, I'm sorry. It expects a single genome, so if your data contains more than one genome, it is quite likely that it won't work: the model won't be able to fit the k-mer histograms that have been generated.
If you have separate samples and each sample contains a single genome, then it is recommended to run KmerGenie on each sample separately. However, if each sample contains multiple genomes, then KmerGenie most likely won't work.
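For the single-genome-per-sample case, here is a rough sketch of how the per-sample runs could be prepared. It assumes KmerGenie is invoked as `kmergenie <list_file> -o <prefix>`, with the list file naming one read file per line; the function name, sample names, and file names are made up for illustration. It only writes the list files and builds the commands, it does not run anything.

```python
import os


def write_kmergenie_jobs(samples, work_dir):
    """Write one KmerGenie list file per sample and return one command per sample.

    samples:  dict mapping a (hypothetical) sample name to its read files.
    work_dir: directory where list files and output prefixes are placed.
    """
    os.makedirs(work_dir, exist_ok=True)
    commands = []
    for name in sorted(samples):
        # One list file per sample, one read file per line.
        list_path = os.path.join(work_dir, f"{name}.list.txt")
        with open(list_path, "w") as handle:
            handle.write("\n".join(samples[name]) + "\n")
        # Assumed invocation: kmergenie <list_file> -o <output_prefix>
        prefix = os.path.join(work_dir, name)
        commands.append(["kmergenie", list_path, "-o", prefix])
    return commands


if __name__ == "__main__":
    example = {
        "sampleA": ["sampleA_R1.fastq", "sampleA_R2.fastq"],
        "sampleB": ["sampleB_R1.fastq", "sampleB_R2.fastq"],
    }
    for cmd in write_kmergenie_jobs(example, "kmergenie_runs"):
        print(" ".join(cmd))
```

Each returned command could then be passed to `subprocess.run`, so that every sample gets its own histogram fit and its own suggested k.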
The good news is that for metagenomic assembly you could try multi-k assemblers such as SPAdes or MEGAHIT, where you do not need to specify a best k value.
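As a minimal sketch of what that could look like for several paired-end samples, the snippet below builds one assembler command per sample. It assumes the usual command-line forms `spades.py --meta -1 R1 -2 R2 -o OUT` and `megahit -1 R1 -2 R2 -o OUT`; the function name, sample names, and paths are placeholders, and the commands are only constructed, not executed.

```python
def build_assembly_commands(samples, out_root="assemblies", assembler="spades"):
    """Return one multi-k assembler command (as an argument list) per sample.

    samples:   dict mapping a (hypothetical) sample name to an (R1, R2) pair.
    out_root:  parent directory for the per-sample output directories.
    assembler: "spades" (metagenomic mode) or "megahit".
    """
    commands = []
    for name in sorted(samples):
        r1, r2 = samples[name]
        out_dir = f"{out_root}/{name}"
        if assembler == "spades":
            # SPAdes in metagenomic mode; no single k has to be chosen by hand.
            cmd = ["spades.py", "--meta", "-1", r1, "-2", r2, "-o", out_dir]
        elif assembler == "megahit":
            # MEGAHIT also iterates over several k values by default.
            cmd = ["megahit", "-1", r1, "-2", r2, "-o", out_dir]
        else:
            raise ValueError(f"unknown assembler: {assembler}")
        commands.append(cmd)
    return commands


if __name__ == "__main__":
    example = {"sampleA": ("sampleA_R1.fastq", "sampleA_R2.fastq")}
    for cmd in build_assembly_commands(example, assembler="spades"):
        print(" ".join(cmd))
```

Assembling each sample in its own output directory keeps the runs independent, and since both tools work through multiple k values internally, no KmerGenie estimate is needed beforehand.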
Dear Rayan,
Thanks. Yes, I tested it with SPAdes; it does have a metagenomic option that optimizes for the best k-mer size.