Circular scaffolds in a genome assembly
23 months ago
Shriram • 0

Hi!

When I assemble PacBio HiFi reads from a eukaryotic genome (linear chromosomes) with hifiasm, I get a couple of circular scaffolds. Linear scaffolds have an l appended to their name in the FASTA header, and circular scaffolds have a c.

On what basis are l and c assigned to these scaffolds?

Thanks!

Assembly Pacbio Genome HiFi HiFiasm • 984 views
23 months ago

It is quite possible that you have assembled your organelle genome as well. You can verify that with a homology search. When the assembler builds a sequence and finds read support for joining its two ends (the ends overlap, so the sequence can be circularised), it appends c to the name in the FASTA header. Circular sequences are not always organelles, though: repeat regions of the genome can also come out as circular, because the repetitive sequence lets the two ends overlap each other.
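To make the "overlapping ends" idea concrete, here is a toy sketch that checks whether the start and end of a finished sequence share an exact match. This is not hifiasm's method (per the answer above, the assembler decides from read support during assembly), and the function name `terminal_overlap` and its parameters are made up for illustration. It only shows why a repetitive sequence can look circular: its two ends match each other.

```python
def terminal_overlap(seq, min_len=20, max_len=5000):
    """Return the length of the longest exact match between the start
    and the end of seq (at least min_len bases), or 0 if there is none.

    Hypothetical helper for illustration only; exact matching ignores
    sequencing errors and is not how an assembler circularises contigs.
    """
    seq = seq.upper()
    # Never let the prefix and suffix cover more than half the sequence each.
    limit = min(max_len, len(seq) // 2)
    # Try the longest candidate first so the first hit is the longest overlap.
    for k in range(limit, min_len - 1, -1):
        if seq[:k] == seq[-k:]:
            return k
    return 0


# A sequence whose first 8 bases are repeated at its end.
with_overlap = "ACGTTGCA" + "GGGCCCAT" + "ACGTTGCA"
# A short tandem repeat: the ends overlap purely because of repetition.
tandem_repeat = "ACGT" * 10
# A sequence with no matching ends.
no_overlap = "ACGTACGTTTGGCCAA"

for name, s in [("with_overlap", with_overlap),
                ("tandem_repeat", tandem_repeat),
                ("no_overlap", no_overlap)]:
    print(name, terminal_overlap(s, min_len=4))
```

A positive result on a real contig would only be a hint. The homology search is still what tells you whether the sequence is an organelle or a repeat.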

23 months ago
gconcepcion ▴ 410

To elaborate slightly on bagdevi.mishra's reply, which I agree with: try taking one or more of the circular sequences and BLASTing them (or a part of them, if they are too large) to see what sort of hits come up in NCBI's database: https://blast.ncbi.nlm.nih.gov/Blast.cgi
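If it helps, here is a minimal sketch for pulling the circular sequences out of the assembly FASTA so they can be pasted into BLAST. It assumes, as described in the question, that the first word of each header ends in c for circular and l for linear sequences. The function names and the `max_len` parameter are made up for this example, and it uses only the Python standard library.

```python
import io


def read_fasta(handle):
    """Yield (header, sequence) pairs from an open FASTA file handle."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def circular_records(handle, max_len=None):
    """Yield (name, sequence) for records whose name ends in 'c'.

    Assumption: the name is the first whitespace-delimited word of the
    header. If max_len is given, only the first max_len bases are kept,
    which is useful when a sequence is too large to BLAST whole.
    """
    for header, seq in read_fasta(handle):
        fields = header.split()
        name = fields[0] if fields else ""
        if name.endswith("c"):
            if max_len is not None:
                seq = seq[:max_len]
            yield name, seq


# Small in-memory example; the sequence names here are illustrative.
example = io.StringIO(
    ">ptg000001l\nACGTACGT\nACGT\n"
    ">ptg000002c\nGGCCTTAA\nGGCC\n"
)
for name, seq in circular_records(example, max_len=10):
    print(f">{name}\n{seq}")
```

For a real assembly, open the FASTA file with `open(path)` and pass the handle to `circular_records` instead of the in-memory example.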
