Hi, I was trying to assemble paired-end reads with the idba-hybrid algorithm, using different k-mer sizes to get the best possible assembly. I used k-mers 39, 41, 43, ..., 69, 71 and got the following result:
reads 56349906 read_length 82
kmer 39 kmers 230430013 228825171 merge bubble 103765 contigs: 378032 n50: 544 max: 35388 mean: 304 total length: 115185374 n80: 251 aligned 35968763 reads confirmed bases: 82111516 correct reads: 23033966 bases: 2268606 distance mean 209.256 sd 98.3785 seed contigs 165543 local contigs 756064
kmer 41 kmers 125484914 125786216 merge bubble 20738 contigs: 335061 n50: 791 max: 48059 mean: 360 total length: 120948997 n80: 295 aligned 36920244 reads confirmed bases: 87217874 correct reads: 23325239 bases: 101875 distance mean 211.548 sd 99.5509 seed contigs 140395 local contigs 670122
kmer 43 kmers 124302320 124588795 merge bubble 15703 contigs: 307517 n50: 884 max: 47745 mean: 396 total length: 121913541 n80: 316 aligned 37380266 reads confirmed bases: 89122603 correct reads: 23546996 bases: 57649 distance mean 212.215 sd 100.013 seed contigs 135917 local contigs 615034
kmer 45 kmers 124052526 124313720 merge bubble 13368 contigs: 290649 n50: 941 max: 53489 mean: 421 total length: 122434083 n80: 329 aligned 37615762 reads confirmed bases: 90212060 correct reads: 23691049 bases: 44094 distance mean 212.617 sd 100.311 seed contigs 133189 local contigs 581298
kmer 47 kmers 123642915 123884380 merge bubble 11896 contigs: 274412 n50: 983 max: 75436 mean: 446 total length: 122478930 n80: 342 aligned 37904249 reads confirmed bases: 91039014 correct reads: 23802925 bases: 32467 distance mean 212.823 sd 100.445 seed contigs 132567 local contigs 548824
kmer 49 kmers 123353786 123569290 merge bubble 11769 contigs: 255496 n50: 1047 max: 50123 mean: 478 total length: 122342578 n80: 360 aligned 38164275 reads confirmed bases: 91905459 correct reads: 23934796 bases: 34555 distance mean 213.294 sd 100.77 seed contigs 130357 local contigs 510992
kmer 51 kmers 122897595 123088511 merge bubble 10828 contigs: 240514 n50: 1114 max: 52424 mean: 508 total length: 122307272 n80: 374 aligned 38330692 reads confirmed bases: 92599279 correct reads: 24018592 bases: 29058 distance mean 213.644 sd 101.012 seed contigs 128075 local contigs 481028
kmer 53 kmers 122221584 122390550 merge bubble 9812 contigs: 229071 n50: 1152 max: 57374 mean: 533 total length: 122241704 n80: 387 aligned 38499534 reads confirmed bases: 93099171 correct reads: 24096322 bases: 21692 distance mean 213.855 sd 101.168 seed contigs 127434 local contigs 458142
kmer 55 kmers 121844000 121992660 merge bubble 9726 contigs: 217178 n50: 1208 max: 57519 mean: 562 total length: 122176218 n80: 401 aligned 38564850 reads confirmed bases: 93579451 correct reads: 24139532 bases: 22191 distance mean 214.138 sd 101.37 seed contigs 125654 local contigs 434356
kmer 57 kmers 121263301 121391861 merge bubble 9273 contigs: 208770 n50: 1256 max: 75446 mean: 585 total length: 122239232 n80: 411 aligned 38585027 reads confirmed bases: 93924244 correct reads: 24150729 bases: 17707 distance mean 214.321 sd 101.527 seed contigs 124056 local contigs 417540
kmer 59 kmers 120618126 120729076 merge bubble 8522 contigs: 199615 n50: 1296 max: 75448 mean: 611 total length: 122082802 n80: 423 aligned 38675347 reads confirmed bases: 94265540 correct reads: 24193520 bases: 16468 distance mean 214.441 sd 101.597 seed contigs 123336 local contigs 399230
kmer 61 kmers 120050803 120143416 merge bubble 8425 contigs: 189862 n50: 1351 max: 75450 mean: 642 total length: 121903613 n80: 438 aligned 38718793 reads confirmed bases: 94562618 correct reads: 24213798 bases: 16256 distance mean 214.654 sd 101.766 seed contigs 121599 local contigs 379724
kmer 63 kmers 119331443 119406048 merge bubble 8131 contigs: 181499 n50: 1402 max: 75452 mean: 670 total length: 121740686 n80: 450 aligned 38717993 reads confirmed bases: 94797600 correct reads: 24229825 bases: 14843 distance mean 214.813 sd 101.905 seed contigs 120044 local contigs 362998
kmer 65 kmers 118592377 118651701 merge bubble 7666 contigs: 173753 n50: 1446 max: 75454 mean: 699 total length: 121517743 n80: 463 aligned 38722383 reads confirmed bases: 94991466 correct reads: 24239760 bases: 12668 distance mean 214.92 sd 101.961 seed contigs 118948 local contigs 347506
kmer 67 kmers 117847057 117890571 merge bubble 7600 contigs: 164899 n50: 1509 max: 75456 mean: 734 total length: 121193575 n80: 478 aligned 38705550 reads confirmed bases: 95163219 correct reads: 24255762 bases: 11984 distance mean 215.142 sd 102.124 seed contigs 117002 local contigs 329798
kmer 69 kmers 117033262 117060903 merge bubble 7314 contigs: 156988 n50: 1571 max: 75458 mean: 769 total length: 120856345 n80: 493 aligned 38704024 reads confirmed bases: 95296193 correct reads: 24270705 bases: 11324 distance mean 215.318 sd 102.256 seed contigs 115062 local contigs 313976
kmer 71 kmers 116074965 116089938 merge bubble 6922 contigs: 140946 n50: 1683 max: 75460 mean: 848 total length: 119579181 n80: 525
reads 56349906 aligned 38852447 reads distance mean 215.596 sd 102.385 expected coverage 0.362628 edgs 10525 contigs: 133449 n50: 1989 max: 140103 mean: 889 total length: 118728461 n80: 537
So, as you can see, as the k-mer size increases, the N50 increases and the number of contigs decreases. I am new to this area. Can you suggest which k-mer I should use? I hope you understand my question. Best, SG
I think it is worth reading the paper describing the algorithm to understand this better.
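While reading the paper, it may also help to put the per-k numbers from the log side by side rather than scanning the raw output. The following is only a rough sketch: it assumes the log looks like the output pasted in the question (`kmer N ... contigs: ... n50: ... max: ... mean: ... total length: ...`), and the names `parse_idba_log`, `SAMPLE_LOG`, `PATTERN` and `FIELDS` are made up for illustration; they are not part of IDBA.

```python
import re

# Excerpt of the log from the question (first and last iterations only).
SAMPLE_LOG = """\
kmer 39 kmers 230430013 228825171 merge bubble 103765 contigs: 378032 n50: 544 max: 35388 mean: 304 total length: 115185374 n80: 251
kmer 71 kmers 116074965 116089938 merge bubble 6922 contigs: 140946 n50: 1683 max: 75460 mean: 848 total length: 119579181 n80: 525
"""

# Assumed log layout: "kmer <k>" followed, somewhere later, by the contig
# statistics for that iteration. "kmers <count>" does not match "kmer (\d+)".
PATTERN = re.compile(
    r"kmer (\d+)\b.*?contigs: (\d+) n50: (\d+) max: (\d+) mean: (\d+) "
    r"total length: (\d+)",
    re.DOTALL,
)

FIELDS = ("kmer", "contigs", "n50", "max", "mean", "total_length")


def parse_idba_log(text):
    """Return one dict of integer statistics per 'kmer N' iteration found."""
    return [dict(zip(FIELDS, map(int, m.groups())))
            for m in PATTERN.finditer(text)]


rows = parse_idba_log(SAMPLE_LOG)

# Print a small aligned table: one row per k-mer size.
print("".join(f"{name:>14}" for name in FIELDS))
for row in rows:
    print("".join(f"{row[name]:>14}" for name in FIELDS))
```

Running it on the full log instead of the two-line excerpt gives one row per k-mer size. It only tabulates what the assembler reported; it does not by itself say which k is best.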