kmergenie generates histo files but no histograms!
10.1 years ago
abujamel.t • 0

I am using kmergenie version 1.6473 to estimate the optimal k-mer size for my assembly. The program runs without errors, but it only generates .histo files and no histogram plots.

Here is the command I used:

kmergenie unligned_R1.fastq unligned_R2.fastq -o kmer -s 2 -l 13 -t 32

Linear estimation: ~51 M distinct 71-mers are in the reads
K-mer sampling: 1/15
| processing                                                                                         |
[going to estimate histograms for values of k: 121 119 117 115 113 111 109 107 105 103 101 99 97 95 93 91 89 87 85 83 81 79 77 75 73 71 69 67 65 63 61 59 57 55 53 51 49 47 45 43 41 39 37 35 33 31 29 27 25 23 21 19 17 15 13
-----------------------------------------------------------------------------------------------------------------------Total time Wallclock  628.354 s

The result was only a list of .histo files, from kmer-k13.histo to kmer-k121.histo, but no graphs!

Could anyone help me solve this problem?

Cheers,
TJ

KmerGenie histogram • 5.4k views

Hi, can you post the list of files that it generated, as well as the first few lines of one of the .histo files?

Also, did it output any error in stderr?


Hi Rayan,

The file list is as follows:

kmer-k13.histo
kmer-k15.histo
kmer-k17.histo
kmer-k19.histo
kmer-k21.histo
kmer-k23.histo
kmer-k25.histo
kmer-k27.histo
kmer-k29.histo
kmer-k31.histo
kmer-k33.histo
kmer-k35.histo
kmer-k37.histo
kmer-k39.histo
kmer-k41.histo
kmer-k43.histo
kmer-k45.histo
kmer-k47.histo
kmer-k49.histo
kmer-k51.histo
kmer-k53.histo
kmer-k55.histo
kmer-k57.histo
kmer-k59.histo
kmer-k61.histo
kmer-k63.histo
kmer-k65.histo
kmer-k67.histo
kmer-k69.histo
kmer-k71.histo
kmer-k73.histo
kmer-k75.histo
kmer-k77.histo
kmer-k79.histo
kmer-k81.histo
kmer-k83.histo
kmer-k85.histo
kmer-k87.histo
kmer-k89.histo
kmer-k91.histo
kmer-k93.histo
kmer-k95.histo
kmer-k97.histo
kmer-k99.histo
kmer-k101.histo
kmer-k103.histo
kmer-k105.histo
kmer-k107.histo
kmer-k109.histo
kmer-k111.histo
kmer-k113.histo
kmer-k115.histo
kmer-k117.histo
kmer-k119.histo
kmer-k121.histo

kmer-k121.histo shows:

1    1863
2    0
3    0
4    0
5    0
6    0
7    0
8    0
9    0
10    0
11    0
12    0
13    0
14    0
15    0
16    0
17    0

There was no error.

TJ


Thanks. Is this an actual sequencing dataset? There shouldn't be so few k-mers that appear once, twice, three times, etc. (that is what the .histo files record: for each abundance, the number of distinct k-mers seen that many times).

Or is this a simulated dataset? Kmergenie only works with real data, or with data simulated according to a realistic sequencing scenario.
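
For anyone wanting to check their own files the same way, here is a minimal sketch that summarizes a .histo file. It assumes the two-column layout seen in the excerpt posted earlier in the thread (abundance, then number of distinct k-mers). `histo_summary` is a hypothetical helper name, not something shipped with KmerGenie.

```shell
# Hypothetical helper (not part of KmerGenie): summarize a .histo file.
# Assumes two whitespace-separated columns per line:
#   column 1 = abundance, column 2 = number of distinct k-mers with that abundance.
histo_summary() {
    awk '
        NF >= 2 {
            bins++                      # every abundance value listed
            total += $2                 # running sum of distinct k-mers
            if ($2 > 0) nonzero++       # abundance values that actually occur
        }
        END {
            printf "bins=%d nonzero=%d distinct_kmers=%d\n", bins, nonzero, total
        }
    ' "$1"
}
```

Usage: `histo_summary kmer-k121.histo`. A file in which almost every bin is empty, like the excerpt above where only abundance 1 has a non-zero count, is the kind of histogram that looks suspicious for real sequencing data.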


It is an actual metagenomic sequencing dataset, generated on an Illumina HiSeq 2500.

TJ


Thanks. Histogram generation should work on that data. (Note, however, that Kmergenie is not designed for metagenomic data).

Let's see if the fault comes from your system or from that specific dataset. Could you try a simple (small, for quick execution) genomic dataset, using the same parameters -o kmer -s 2 -l 13 -t 32, and see if kmergenie completes successfully on your system?


Thanks Rayan for your reply,

I tried sample paired-end FASTQ files with 750K reads and a read length of 36. Same problem.

TJ


Alright, so something is wrong with either the system or the software. I checked that command line, and it works fine on my computer on a sample dataset.

What's the output of the "make check" command in the kmergenie folder? Does it print an error?

What's your system? (desktop, cluster?)

Could you try running kmergenie without any option except -o kmer on that sample data?


This is the output of make check

scripts/test_install
Testing presence of specialk....
OK
Testing presence of Rscript....
R scripting front-end version 3.1.2 (2014-10-31)
OK
Testing basic Rscript functionality....
Rscript --no-init-file -e 'rnorm(1)'
[1] "rnorm(1)"
OK
Testing a simple KmerGenie example....
initial estimate of genomic kmers gaussian mean, sd, error proportion, shape: 3 1.4826 0.9596929 0
p$u.v: 4.114784
abundance    ratio_of_erroneous_over_correct_kmers
1   23228317
2   0.000002223243
3   0.000000007846881
4   0.0000000008400604
cutoff: 1
sum probs good 1.001097cutoff 1
non-repeated genomic distinct kmers:  42
repeated genomic distinct kmers:  0
sum of absolute differences of fit: 2.145231
42
Test successful if the number 42 was printed the line above. KmerGenie is ready, type `./kmergenie`.

My system is a Dell workstation running Biolinux 8 (Ubuntu 14.04)

I tried running the command as you specified. The same thing happened, but it tested only two values of k, 21 and 31 (only .histo files, no graphs).

TJ


Thanks for this information. Your install looks normal, I'm getting puzzled.

One more debugging step I can think of: what's the output of the following command, executed in the folder where your .histo files reside (replace XXX with the path to the kmergenie folder)?

XXX/scripts/decide kmer

(kmer is the string given to the -o parameter)

This should run the second phase of Kmergenie manually and print some debug information in stdout.
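
The manual step above can be wrapped in a small check, sketched here under a few assumptions: the .histo files follow the `<prefix>-k<number>.histo` naming seen in this thread, `KMERGENIE_DIR` is a variable you set yourself to the kmergenie folder (the XXX above), and `count_histo` and `run_decide` are hypothetical helper names.

```shell
# Hypothetical helpers (not part of KmerGenie).
# KMERGENIE_DIR is an assumed variable holding the path to the kmergenie folder.

# Count files named <prefix>-k*.histo in the current directory.
count_histo() {
    prefix=$1
    n=0
    for f in "$prefix"-k*.histo; do
        # When nothing matches, the pattern stays literal and -e fails.
        if [ -e "$f" ]; then
            n=$((n + 1))
        fi
    done
    echo "$n"
}

# Run the second phase by hand, but only if phase-one output is present.
run_decide() {
    prefix=$1
    if [ "$(count_histo "$prefix")" -eq 0 ]; then
        echo "no $prefix-k*.histo files found in $(pwd)" >&2
        return 1
    fi
    "$KMERGENIE_DIR/scripts/decide" "$prefix"
}
```

Usage, from the folder holding the .histo files: set `KMERGENIE_DIR=XXX`, then call `run_decide kmer`. The wrapper adds nothing beyond the file check; the actual work is still the `scripts/decide` call given above.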


Hi Rayan,

It worked! It generated the graphs from the .histo files.

So what do you think the issue is?

TJ


For some reason the "decide" script doesn't seem to be getting executed. That is very strange, as the histogram creation program is correctly executed, and both programs are called the same way in the kmergenie main program. To be honest, I do not know why this happens, and I haven't seen it with any other user. Can you please paste the full output of the decide command? Maybe I'll find something unusual.


Was this ever resolved? I have the same issue, thanks.


I haven't heard back from TJ. Could you please execute the following command, and paste the full output, so that I may gain some insight into the problem?

[path_to_kmergenie]/scripts/decide [prefix]

where [prefix] is the string given to the -o parameter.


I get a list of .histo files but no PDFs when I disown kmergenie and log out (it also does not proceed to the second round). When I only disown it and keep the terminal alive, it runs fine. I don't know if this provides any insight.

Cheers, John
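
One possible reading of this observation, not confirmed anywhere in the thread, is that the second phase fails once the terminal goes away. A common general-purpose way to start a long job so that it ignores the hangup signal and has no ties to the terminal is `nohup` with all three standard streams redirected. The sketch below is a workaround idea only, not a verified fix for KmerGenie; `run_detached` and the log file name are hypothetical.

```shell
# Hypothetical helper (not part of KmerGenie): start a command that ignores
# SIGHUP, with stdin, stdout and stderr detached from the terminal.
# Usage: run_detached <logfile> <command> [args...]
run_detached() {
    log=$1
    shift
    # stdout and stderr go to the log file; stdin is read from /dev/null.
    nohup "$@" > "$log" 2>&1 < /dev/null &
    # Print the PID of the background job so it can be monitored later.
    echo "$!"
}
```

Usage, with `reads.fastq` standing in for your own input: `run_detached kmergenie.log kmergenie reads.fastq -o kmer`. After logging back in, check whether the PDFs were produced; if only .histo files exist, running `scripts/decide` by hand as described earlier in the thread remains the fallback.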

9.5 years ago

Hi Rayan! You are very attentive, thank you.

I had a similar issue. The program ran OK and made the .histo files, PDFs and everything, but could not predict the best k.

So I ran ./decide as you said and it worked. Interesting!

The message:

could not fit kmerGenie.histograms-k91.histo
table of predicted num. of genomic k-mers: kmerGenie.histograms.dat
recommended coverage cut-off for best k: 1
best k: 61

That's it! Thank you!
