Hello
I am trying to find the optimal k-mer value with KmerGenie so that I can use it in MetaVelvet for metagenome assembly.
Q. 1) My first question concerns the size of my left and right read files after trimming sequences with a Phred score < 20. Before trimming, both the left and right read files were 10,014 MiB. After trimming with the NGS QC Toolkit, the left read file was 9412 MiB and the right read file was 9288 MiB. Is this a problem? Why do the file sizes differ now, when they were the same before?
Q. 2) My reads are paired-end, 102 bp long. I first ran KmerGenie on the left reads, which gave a best k=25, and then on the right reads, which gave a best k=21. Which k-mer value should I use to assemble the left and right reads together?
Please guide me, I would be heartily thankful.
Best regards
I'd use a program like prinseq-lite.pl to make sure that the reads are properly paired. The following command can do that; it will also remove reads shorter than 10 bases.
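To illustrate what pair-aware filtering involves, here is a minimal Python sketch. It is not prinseq-lite.pl and does not reproduce its command line; the function names (`read_fastq`, `read_id`, `filter_pairs`), the `/1` and `/2` mate suffixes, and the toy reads are all assumptions made for illustration. It keeps the whole right-hand file in memory, so it is only suitable for small test inputs.

```python
import io


def read_fastq(handle):
    """Yield (header, sequence, quality) from a 4-line-per-record FASTQ stream."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            return
        seq = handle.readline().rstrip("\n")
        handle.readline()  # '+' separator line, ignored
        qual = handle.readline().rstrip("\n")
        yield header, seq, qual


def read_id(header):
    """Drop the leading '@', any trailing comment, and an assumed /1 or /2 suffix."""
    name = header[1:].split()[0]
    if name.endswith("/1") or name.endswith("/2"):
        name = name[:-2]
    return name


def filter_pairs(left, right, min_len=10):
    """Keep pairs where both mates exist and both are at least min_len long."""
    right_by_id = {read_id(h): (h, s, q) for h, s, q in read_fastq(right)}
    kept = []
    for h, s, q in read_fastq(left):
        mate = right_by_id.get(read_id(h))
        if mate and len(s) >= min_len and len(mate[1]) >= min_len:
            kept.append(((h, s, q), mate))
    return kept


# Toy data (hypothetical): r2 has a short left mate, r3 has no right mate.
left_text = (
    "@r1/1\nACGTACGTACGT\n+\nIIIIIIIIIIII\n"
    "@r2/1\nACGT\n+\nIIII\n"
    "@r3/1\nACGTACGTACGTA\n+\nIIIIIIIIIIIII\n"
)
right_text = (
    "@r1/2\nTTTTACGTACGT\n+\nIIIIIIIIIIII\n"
    "@r2/2\nACGTACGTACGT\n+\nIIIIIIIIIIII\n"
)

kept = filter_pairs(io.StringIO(left_text), io.StringIO(right_text))
print([read_id(left_rec[0]) for left_rec, _ in kept])
```

Only the pair whose two mates both survive is kept. This is also why the left and right files can end up with different sizes after independent trimming: each mate is trimmed according to its own quality values.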
You can then re-run KmerGenie on the prinseq output files. You should also check out other de novo assembly tools like SPAdes and IDBA-UD, which can use multiple k-mer values for the de novo assembly.