How to choose the best assembly based on the number of Ns, and how to find telomere and centromere markers
2.8 years ago
Théo • 0

Hello,

I have 3 FASTA files of a fungal genome assembly, produced by 3 different assemblers. The files contain N characters, which represent the many transposable elements in my genomes.

I want to choose the best of the 3 assemblies based on their N content.

Is there a rule saying that the assembly with the fewest Ns is the best?

Is there a cutoff value for the number of Ns?

I also have another question: how can I identify centromeres and telomeres if they are masked, given that all repeated regions are Ns?

Do I need to check the RepeatMasker options?

Thanks for your answers.

centromere assembly telomere genome • 607 views
2.8 years ago
liorglic ★ 1.5k

There are several measures of assembly quality, e.g. contiguity (N50, N90), total assembly size, BUSCO score, and the percentage of Ns in the assembly. If all other stats are similar, the assembly with the lowest % Ns should generally be favored.
You can try a tool such as QUAST, which calculates many assembly stats for you, so you can easily compare your results.
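
If you just want a quick look at the % Ns and contiguity before running a full tool, something like the following rough Python sketch could work. It is not how QUAST computes its report; the function names (`read_fasta`, `nxx`, `assembly_stats`) and the demo sequences are made up for illustration, and it assumes plain, uncompressed FASTA input.

```python
from typing import Dict, Iterable, List


def read_fasta(lines: Iterable[str]) -> List[str]:
    """Return the sequences found in FASTA-formatted lines (headers dropped)."""
    seqs: List[str] = []
    chunks: List[str] = []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            # A new header closes the previous record, if it had any sequence.
            if chunks:
                seqs.append("".join(chunks))
            chunks = []
        elif line:
            chunks.append(line)
    if chunks:
        seqs.append("".join(chunks))
    return seqs


def nxx(lengths: List[int], fraction: float = 0.5) -> int:
    """Length of the contig at which the running sum of contig lengths
    (longest first) reaches `fraction` of the total: 0.5 gives N50."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= total * fraction:
            return length
    return 0


def assembly_stats(lines: Iterable[str]) -> Dict[str, float]:
    """Basic per-assembly stats: size, N content, N50 and N90."""
    seqs = read_fasta(lines)
    lengths = [len(s) for s in seqs]
    total = sum(lengths)
    # Count both hard-masked "N" and lowercase "n".
    n_count = sum(s.upper().count("N") for s in seqs)
    return {
        "contigs": len(seqs),
        "total_length": total,
        "n_count": n_count,
        "percent_n": 100.0 * n_count / total if total else 0.0,
        "n50": nxx(lengths, 0.5),
        "n90": nxx(lengths, 0.9),
    }


if __name__ == "__main__":
    # Tiny made-up assembly, only to show the call.
    demo = [">c1", "ACGTNNNNAC", ">c2", "ACGTA", ">c3", "NNACG"]
    print(assembly_stats(demo))
```

For real data you could pass an open file handle instead of the demo list, e.g. `with open("assembly_A.fasta") as fh: print(assembly_stats(fh))` (the file name is hypothetical), and run it once per assembly to compare the numbers side by side.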
