Statistical analysis of a genome assembled from PacBio HiFi reads using hifiasm
3.5 years ago
K ▴ 10

Hi,

I have consensus (CCS) reads from a PacBio sequencing run. I used the assembler hifiasm to build an assembly from these CCS reads and got five .gfa files. I converted the .gfa files to .fasta contigs using Bandage and an awk command.
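For readers who want to reproduce the GFA-to-FASTA step without Bandage or awk, here is a minimal Python sketch. It assumes the standard GFA layout, in which each segment line starts with `S` followed by tab-separated name and sequence fields. The function name `gfa_to_fasta` and the file names in the comments are hypothetical, not part of hifiasm.

```python
def gfa_to_fasta(gfa_lines):
    """Yield one FASTA record (header + sequence) per GFA segment line."""
    for line in gfa_lines:
        # Only segment records carry contig sequences; skip H, L, A, ... lines.
        if not line.startswith("S\t"):
            continue
        fields = line.rstrip("\n").split("\t")
        name, seq = fields[1], fields[2]
        # GFA allows "*" when the sequence is not stored in the file.
        if seq == "*":
            continue
        yield f">{name}\n{seq}\n"


# Small in-memory demo; with a real file one could write (hypothetical names):
#   with open("asm.p_ctg.gfa") as gfa, open("asm.p_ctg.fa", "w") as fa:
#       fa.writelines(gfa_to_fasta(gfa))
demo = ["H\tVN:Z:1.0\n", "S\tctg1\tACGTACGT\tLN:i:8\n", "L\tctg1\t+\tctg2\t-\t0M\n"]
print("".join(gfa_to_fasta(demo)), end="")
```

Each of the .gfa files would be converted separately, since each one describes a different graph.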

What can be used to get more statistics on the assembly?

ccs gfa pacbio contigs hifiasm • 3.7k views

QUAST and BUSCO would be my initial suggestions, but it really depends on which specific stats you want to compute.


Bandage gave me the N50 and total length, but I am particularly looking for the coverage and depth of the assembly.


To get that, you should map the reads back to the assembly (e.g., with minimap2 for long reads) and then use a tool like samtools or mosdepth to get the depth from the SAM/BAM file.
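As a rough illustration of the last step, the sketch below summarises per-contig mean depth and breadth of coverage from the three-column text (contig, position, depth) that `samtools depth -a` prints. The mapping commands in the comments are one plausible way to produce that file, with placeholder file names; the function name `summarize_depth` is hypothetical.

```python
from collections import defaultdict

# Assumed upstream steps (file names are placeholders):
#   minimap2 -ax map-hifi asm.fa reads.fastq.gz | samtools sort -o aln.bam
#   samtools depth -a aln.bam > depth.tsv


def summarize_depth(depth_lines):
    """Return {contig: {"mean_depth": float, "breadth": float}}.

    Expects tab-separated lines of contig, 1-based position, depth.
    Breadth is the fraction of reported positions with depth > 0.
    """
    # Per contig: [positions seen, summed depth, positions with depth > 0]
    totals = defaultdict(lambda: [0, 0, 0])
    for line in depth_lines:
        line = line.rstrip("\n")
        if not line:
            continue
        contig, _pos, depth = line.split("\t")[:3]
        depth = int(depth)
        entry = totals[contig]
        entry[0] += 1
        entry[1] += depth
        entry[2] += 1 if depth > 0 else 0
    return {
        contig: {"mean_depth": total / n, "breadth": covered / n}
        for contig, (n, total, covered) in totals.items()
    }


demo = ["ctg1\t1\t10\n", "ctg1\t2\t0\n", "ctg1\t3\t20\n", "ctg2\t1\t5\n"]
for contig, stats in summarize_depth(demo).items():
    print(contig, stats)
```

Note that the breadth value is only meaningful when zero-depth positions are included in the input, which is why the `-a` option is assumed here.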

3.5 years ago
Billy Rowell ▴ 330

You can get N50/NG50 from https://github.com/lh3/calN50.
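For anyone curious what these numbers mean, here is a minimal Python sketch of the usual N50/NG50 definitions. It is not the calN50 implementation; the function names `fasta_lengths` and `nx` are made up for illustration, and the genome size used for NG50 is a value you have to supply yourself.

```python
def fasta_lengths(lines):
    """Return the sequence length of each record in FASTA-formatted lines."""
    lengths = []
    current = None
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if current is not None:
                lengths.append(current)
            current = 0
        elif current is not None:
            current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def nx(lengths, fraction=0.5, genome_size=None):
    """Return Nx, or NGx when genome_size is given.

    Contigs are added from longest to shortest; the result is the length of
    the contig at which the running total first reaches `fraction` of the
    assembly size (Nx) or of the supplied genome size (NGx). Returns None
    if the target is never reached.
    """
    base = genome_size if genome_size is not None else sum(lengths)
    target = fraction * base
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= target:
            return length
    return None


contigs = [80, 70, 50, 40, 30, 20]        # toy contig lengths, total 290
print(nx(contigs))                         # N50
print(nx(contigs, genome_size=400))        # NG50 for an assumed 400 bp genome
```

NG50 differs from N50 only in the denominator, so it stays comparable between assemblies of the same genome even when their total lengths differ.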

3.5 years ago
gconcepcion ▴ 410

This Python 3 script will get you basic FASTA stats: https://github.com/PacificBiosciences/pb-assembly/blob/master/scripts/get_asm_stats.py


This is really great. I am getting the same output as in Bandage. However, how can I get the coverage and depth of the assembly?
