Question

Questions about the Getting-started of wtdbg2

4.1 years ago

boymin2020 ▴ 80

Hi,

This is my first time assembling long reads from nanopore sequencing. I also have short reads generated by an Illumina sequencer. My plan is to use wtdbg2 to produce a draft genome FASTA file and then polish it with Pilon. However, I am stuck on the getting-started section of wtdbg2: I am confused by the input and output files in the following command lines. Do they form a single pipeline, or are they independent examples?

# quick start with wtdbg2.pl
./wtdbg2.pl -t 16 -x rs -g 4.6m -o dbg reads.fa.gz

# step-by-step command lines

# assemble long reads
./wtdbg2 -x rs -g 4.6m -i reads.fa.gz -t 16 -fo dbg

# derive consensus
./wtpoa-cns -t 16 -i dbg.ctg.lay.gz -fo dbg.raw.fa

wtdbg2 nanopore Illumina • 1.8k views


reads.fa.gz is the input sequence file. Substitute your own file.

dbg.raw.fa is the consensus FASTA file written by wtpoa-cns.

ADD REPLY • link 4.1 years ago by GenoMax 147k

score 1 · Answer 1 · 2020-10-28

4.1 years ago

h.mon 35k

wtdbg2.pl is a Perl script that wraps the whole wtdbg2 pipeline in one command. It assembles the reads (with wtdbg2), derives the consensus (with wtpoa-cns), then maps the reads back to the consensus (with minimap2) and filters the alignments (with samtools) to obtain a polished assembly (again with wtpoa-cns).

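The two commands below the wtdbg2.pl line (wtdbg2 and wtpoa-cns) correspond to the first two steps of the Perl pipeline. They are chained: the -fo dbg prefix given to wtdbg2 produces dbg.ctg.lay.gz, which is the input to wtpoa-cns.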

So you can either run the Perl script and be done with it, or run each command separately.
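To make the chaining explicit, here is a minimal dry-run sketch that only prints the commands, so you can see how each output file feeds the next step. The variable names (READS, PREFIX, THREADS, GSIZE, PRESET) and the print_pipeline function are hypothetical, chosen for illustration. Steps 1 and 2 are the commands from the question. Step 3 is the read-based polishing step as I recall it from the wtdbg2 README, so check it against the README of your installed version before running it.

```shell
#!/bin/sh
# Dry-run sketch: prints the step-by-step commands instead of running them.
# All variable names below are hypothetical; defaults mirror the question.
READS=${READS:-reads.fa.gz}   # input long reads
PREFIX=${PREFIX:-dbg}         # output prefix passed to wtdbg2 via -fo
THREADS=${THREADS:-16}
GSIZE=${GSIZE:-4.6m}          # approximate genome size
PRESET=${PRESET:-rs}          # sequencing-technology preset from the question

print_pipeline() {
  cat <<EOF
# 1. assemble long reads; writes ${PREFIX}.ctg.lay.gz
wtdbg2 -x ${PRESET} -g ${GSIZE} -i ${READS} -t ${THREADS} -fo ${PREFIX}
# 2. derive consensus from the layout; writes ${PREFIX}.raw.fa
wtpoa-cns -t ${THREADS} -i ${PREFIX}.ctg.lay.gz -fo ${PREFIX}.raw.fa
# 3. (assumed from the README) map reads back, filter, and polish; writes ${PREFIX}.cns.fa
minimap2 -t ${THREADS} -ax map-pb -r2k ${PREFIX}.raw.fa ${READS} | samtools sort -@4 > ${PREFIX}.bam
samtools view -F0x900 ${PREFIX}.bam | wtpoa-cns -t ${THREADS} -d ${PREFIX}.raw.fa -i - -fo ${PREFIX}.cns.fa
EOF
}

print_pipeline
```

The defaults use the rs and map-pb presets from the PacBio example. For nanopore reads, wtdbg2 has an ont preset and minimap2 has map-ont; confirm both with the tools' help output for your versions.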