Minia tutorial in the Biostar Handbook
3.5 years ago
damonlbp ▴ 20

Hi all,

I'm slowly making my way through the Biostar Handbook chapter on sequence assembly, specifically The Lucky Bioinformatician's Guide.

I've come across an issue with minia. The default command in the handbook does not work for me:

minia -in $READS -out genome

It requires more arguments, so I ended up with this:

minia SRR1553425_1.fastq -1 10 10200000 genome

The problem is I now get this error:

(base) me@me-ubuntu:~/biostars/lucky$ minia SRR1553425_1.fastq -1 10 10200000 genome
estimated values: nbits Bloom 0, nb FP 148021, max memory 1 MB
taille cell 16 
Available disk space in /home/dlbp/biostars/lucky: 54007 MB
Sequentially counting ~78 MB of kmers with 58 partition(s) and 4 passes using 1 thread(s), ~1 MB of memory and ~26 MB of disk space
| First step: Converting input file into Binary format                                               |
[-----------------------------------------------------------------------------------------------------]
| Counting kmers                                                                                     |
100    %     elapsed:      0 min 1     sec      estimated remaining:      0 min 0     sec 

Saved 96676 solid kmers
-------------------Counted kmers time Wallclock  0.980047 s

------------------ Counted kmers and kept those with abundance >=10,     
0 is not a valid value for number of hash funcs, should be in [1-10], resuming wild old value 4
 Writing positive Bloom Kmers 90000
773408 kmers written
-------------------Write all positive kmers time Wallclock  0.058599 s
1 partitions will be needed
Build Hash table 50000End of debloom partition  52429 / 52428 

665206 false positives written , partition 0 
Build Hash table 90000Total nb false positives stored in the Debloom hashtable 573824 
-------------------Debloom time Wallclock  0.105797 s
0 is not a valid value for number of hash funcs, should be in [1-10], resuming wild old value 4
Insert solid Kmers in Bloom 900000 is not a valid value for number of hash funcs, should be in [1-10], resuming wild old value 4
0 is not a valid value for number of hash funcs, should be in [1-10], resuming wild old value 4
0 is not a valid value for number of hash funcs, should be in [1-10], resuming wild old value 4
Floating point exception (core dumped)

I can't find any fix for this whatsoever, and quite frankly I am not experienced enough to understand how to fix it myself. Would anyone know where to even begin?

biostar-handbook minia • 1.2k views

How much memory do you have on this machine? I don't recall whether the Biostar Handbook states the minimum hardware configuration needed to complete its exercises, but my first guess is that you may not have enough memory.


I have 32 GB, but since the files are 27 MB each I would have thought that would be more than enough. I've just tried 100 MB, 1 GB, and 10 GB of memory, with no better result.

I can't find anything about minimum hardware requirements either.

3.5 years ago

The error indicates some sort of internal software error. Probably not something we can debug here.

That assembly chapter is also ripe for a rewrite, which might happen this year.

Today I would recommend the megahit assembler:

https://github.com/voutcn/megahit

It will usually do a much better job, with low resource usage.

conda install megahit
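
Before running the commands that follow, it can help to confirm that each tool is actually on your PATH. This is only a minimal sketch: the helper name `missing_tools` is made up for illustration, and the tool list simply mirrors the commands used in this answer.

```shell
# Print the name of every listed tool that is not found on PATH.
# (missing_tools is a hypothetical helper, not part of any package.)
missing_tools() {
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
        fi
    done
}

# Tools used in this answer; adjust the list to your own setup.
missing_tools fastq-dump trimmomatic megahit
```

If nothing is printed, all three tools were found; any name that is printed still needs to be installed.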

Then, to be more in tune with our times, use the SRR number SRR10971381 to reproduce the original discovery of the SARS-CoV-2 virus as described in:

then:

# Get the data
fastq-dump SRR10971381 --split-files  --origfmt --outdir reads

# Trim by quality
R1=reads/SRR10971381_1.fastq
R2=reads/SRR10971381_2.fastq
trimmomatic PE $R1 $R2 -baseout reads/read.fq SLIDINGWINDOW:4:30

# Assemble the reads
RT1=reads/read_1P.fq
RT2=reads/read_2P.fq

# Run the megahit assembler.
megahit -1 $RT1 -2 $RT2 -o out
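
As a quick sanity check between trimming and assembly, you could count the reads in the two paired output files, since the `_1P` and `_2P` files should contain the same number of records. The following is a rough sketch that assumes uncompressed FASTQ with exactly four lines per record; `count_fastq_reads` is a hypothetical helper name.

```shell
# Count the records in an uncompressed FASTQ file,
# assuming exactly four lines per record.
count_fastq_reads() {
    awk 'END { print int(NR / 4) }' "$1"
}

# Paired files produced by the trimming step above; the two counts should match.
for f in reads/read_1P.fq reads/read_2P.fq; do
    if [ -f "$f" ]; then
        echo "$f: $(count_fastq_reads "$f") reads"
    else
        echo "$f: not found" >&2
    fi
done
```

If the two counts differ, or a file is missing, the trimming step is worth re-checking before blaming the assembler.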