Canu genome assembly error about low coverage
14 months ago
Yao ▴ 30

I am assembling a genome using Canu 2.2 with the following command on a Slurm cluster:

canu useGrid=false \
    -p tf -d tf-pacbio \
    genomeSize=164.4m \
    -corrected \
    corMhapSensitivity=normal \
    -pacbio SRR23272336_1_corrected.fastq

The input PacBio FASTQ file was corrected with LoRDEC. The low-coverage error has come up three times and I don't know what to do, so I am asking for help.

The genome report before trimming:

-- In sequence store './tf.seqStore':
--   Found 1411073 reads.
--   Found 23257214780 bases (141.46 times coverage).

The genome report after trimming:

-- In sequence store './tf.seqStore':
--   Found 27232 reads.
--   Found 51376451 bases (0.31 times coverage).

The error message:

-- ERROR:  Read coverage (0.31) lower than allowed.
-- ERROR:    stopOnLowCoverage = 10
-- ERROR:
-- ERROR:  This could be caused by an incorrect genomeSize or poor
-- ERROR:  quality reads that cound not be sufficiently corrected.
-- ERROR:
-- ERROR:  You can force Canu to continue by decreasing parameter
-- ERROR:  stopOnLowCoverage (and possibly minInputCoverage too).
-- ERROR:  Be warned that the quality of corrected reads and/or
-- ERROR:  contiguity of contigs will be poor.
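
The coverage figures in the two reports can be reproduced by dividing the total number of bases by the genome size. The snippet below is a minimal sketch of that arithmetic; `coverage` is a hypothetical helper written for this illustration, not part of Canu, and it assumes `genomeSize=164.4m` means 164,400,000 bp.

```shell
#!/usr/bin/env bash
# coverage BASES GENOME_SIZE_BP
# Hypothetical helper (not a Canu command): prints bases / genome size
# rounded to two decimals.
coverage() {
    awk -v bases="$1" -v size="$2" 'BEGIN { printf "%.2f\n", bases / size }'
}

genome_size=164400000   # assumption: genomeSize=164.4m expressed in bp

coverage 23257214780 "$genome_size"   # bases reported before trimming
coverage 51376451 "$genome_size"      # bases reported after trimming
```

The second call gives 0.31, matching the value in the error message. The first may differ from Canu's report in the last digit because of rounding. Since the genome size reproduces both figures, the drop comes from reads being discarded during trimming rather than from a wrong genomeSize.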

Does anyone have any ideas?

Pacbio assembly Canu Genome • 1.5k views

Hi everyone, I am using this command for Citrus limon:

canu -useGrid=false genomeSize=38.2G -d limon_output -p limon -nanopore-raw SRR14842605.fastq

and Canu is showing this error:

-- ERROR:  Read coverage (0.98) lower than allowed.
-- ERROR:    minInputCoverage = 10
-- ERROR:
-- ERROR:  This could be caused by an incorrect genomeSize.
-- ERROR:
-- ERROR:  You can force Canu to continue by decreasing parameter
-- ERROR:  minInputCoverage.  Be warned that the quality of corrected
-- ERROR:  reads and/or contiguity of contigs will be poor.

Please help me deal with this issue.


The genome size you gave is wrong; what is the expected genome size of your organism? Canu uses that number to calculate your coverage and to decide how much data to discard. You told Canu that the estimated genome size is 38 Gbp, which isn't true; 38 Gbp is roughly the total amount of sequence in SRR1484..., not the size of the genome.

Your estimated genome size is 312 Mbp or 0.312 Gbp (Source). Use that number instead and it should run through.
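
To see why the smaller genome size should clear the check, the reported coverage can be rescaled from the assumed genome size to the suggested one. This is only a rough sketch: `implied_coverage` is a hypothetical helper, and the inputs are the 0.98 coverage and `genomeSize=38.2G` from the post above plus the 312 Mbp estimate.

```shell
#!/usr/bin/env bash
# implied_coverage REPORTED_COV ASSUMED_SIZE_BP ACTUAL_SIZE_BP
# Hypothetical helper: the total bases are (reported coverage * assumed size),
# so dividing by the actual genome size gives the coverage Canu would
# report with the corrected genomeSize.
implied_coverage() {
    awk -v cov="$1" -v assumed="$2" -v actual="$3" \
        'BEGIN { printf "%.0f\n", cov * assumed / actual }'
}

# 0.98x at 38.2 Gbp, rescaled to 312 Mbp
implied_coverage 0.98 38200000000 312000000
```

This prints 120, well above the minInputCoverage threshold of 10. In the original command the change would amount to replacing `genomeSize=38.2G` with `genomeSize=312m`.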

14 months ago

LoRDEC trims and corrects already, but you rerun the trimming in Canu so it sounds like it overtrims. There's no need to trim twice. You could add the -trimmed flag in Canu to skip the trimming step.


Thank you for your kind reply! Here is the LoRDEC command I used:

lordec-correct -T 40 -i SRR23272336_1.fastq -2 SRR23272337_1.fastq,SRR23272337_2.fastq -k 19 -s 2 -o SRR23272336_1_corrected.fastq &> lordec_log.log

I only ran lordec-correct, not lordec-trim. Does the program still trim in that case?


Oh, I meant changing the Canu command to skip trimming and leaving LoRDEC as it is:

canu useGrid=false \
    -p tf -d tf-pacbio \
    genomeSize=164.4m \
    -corrected \
    -trimmed \
    corMhapSensitivity=normal \
    -pacbio SRR23272336_1_corrected.fastq
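
Before rerunning, it may help to confirm the input coverage independently of Canu. The sketch below counts reads and bases in a FASTQ file with awk; `fastq_stats` is a hypothetical helper, and it assumes standard four-line FASTQ records (sequences not wrapped across lines).

```shell
#!/usr/bin/env bash
# fastq_stats FILE GENOME_SIZE_BP
# Hypothetical helper: counts reads and bases in a four-line-per-record
# FASTQ file and prints the coverage for the given genome size.
fastq_stats() {
    awk -v size="$2" '
        NR % 4 == 2 { reads++; bases += length($0) }   # line 2 of each record is the sequence
        END { printf "reads=%d bases=%.0f coverage=%.2f\n", reads, bases, bases / size }
    ' "$1"
}

# Example call (file name taken from the command above):
# fastq_stats SRR23272336_1_corrected.fastq 164400000
```

If the coverage printed here is close to the pre-trimming figure in the report, the input itself is fine and the loss happens inside Canu's trimming stage.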

I get your point; I will give it a try. Thanks for your time!
