5 months ago
tubs
Hello, I have an assembly that is larger (3.8 Mbp) than expected (2.3-2.8 Mbp). It also has more contigs (1119) than the accepted maximum (500). This looks like contamination. Is there a way to remove the contaminating contigs, or should we discard the assembly? Any recommendations?
A quick way to check for contamination is to align the assembly to other genomes (it seems you have some available) and/or BLAST the contigs against NCBI, particularly those that do not align in the first step.
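If it helps, here is a minimal sketch of how the BLAST results could be screened afterwards. It assumes you ran BLAST with a custom tabular format, `-outfmt "6 qseqid sseqid pident length evalue bitscore sscinames"`, so that the seventh column holds the subject's scientific name. The function name, the genus `Examplebacter`, and the example rows are hypothetical placeholders, not real data. It keeps the best hit per contig by bitscore and reports contigs whose best hit falls outside the expected genus.

```python
import csv
import io


def off_target_contigs(blast_tsv, expected_genus):
    """Return {contig: best-hit name} for contigs whose best hit is not in expected_genus.

    Assumes 7 tab-separated columns per row:
    qseqid, sseqid, pident, length, evalue, bitscore, sscinames
    """
    best = {}  # contig -> (bitscore, scientific name) of the highest-scoring hit
    for row in csv.reader(io.StringIO(blast_tsv), delimiter="\t"):
        if len(row) < 7 or row[0].startswith("#"):
            continue  # skip blank, short, or comment lines
        contig, bitscore, sciname = row[0], float(row[5]), row[6]
        if contig not in best or bitscore > best[contig][0]:
            best[contig] = (bitscore, sciname)
    genus = expected_genus.lower()
    return {
        contig: name
        for contig, (_, name) in best.items()
        if not name.lower().startswith(genus)
    }


# Hypothetical example rows, for illustration only
example = (
    "contig_1\tsubjA\t99.1\t5000\t0.0\t9000\tExamplebacter alpha\n"
    "contig_2\tsubjB\t98.0\t3000\t0.0\t5500\tOtherus beta\n"
    "contig_2\tsubjC\t90.0\t1000\t1e-50\t800\tExamplebacter alpha\n"
)
suspects = off_target_contigs(example, "Examplebacter")
```

A best hit outside the expected genus is only a hint, so it is worth checking flagged contigs by hand (length, coverage, GC content) before removing them.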
You can also run a metagenomic classifier like Kraken2, with one of the many available databases, on your adapter-trimmed reads to try to identify contamination. Also, a high number of contigs could just mean that your reads are not long enough to span repetitive or complex genomic elements.
What was your sequencing strategy?
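For the Kraken2 suggestion above, a rough sketch of how the report could be summarised. It assumes the standard six-column Kraken2 report written with `--report` (percentage, clade read count, direct read count, rank code, taxid, indented name); reports produced with `--report-minimizer-data` have extra columns and would need different indices. The function name, the 1% threshold, and the example lines (taxa and taxids) are hypothetical.

```python
def species_above(report_text, min_percent=1.0):
    """List (species name, percent of reads) for species-rank rows at or above min_percent."""
    hits = []
    for line in report_text.splitlines():
        fields = line.split("\t")
        if len(fields) < 6:
            continue  # skip blank or malformed lines
        percent = float(fields[0])   # percent of reads in this clade
        rank = fields[3]             # rank code, "S" for species
        name = fields[5].strip()     # name is indented by taxonomic depth
        if rank == "S" and percent >= min_percent:
            hits.append((name, percent))
    # Most abundant species first
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Hypothetical report lines, for illustration only
report = (
    "  5.00\t50\t50\tU\t0\tunclassified\n"
    " 95.00\t950\t0\tR\t1\troot\n"
    " 80.00\t800\t800\tS\t111\t      Examplebacter alpha\n"
    " 12.00\t120\t120\tS\t222\t      Otherus beta\n"
    "  0.50\t5\t5\tS\t333\t      Rareus gamma\n"
)
species = species_above(report)
```

More than one species with a substantial share of the reads would point towards contamination rather than fragmentation alone, although closely related species can also share reads through misclassification.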