Hi,
I have generated some genome assemblies from Nanopore data with the Flye assembler.
I have also performed 3 rounds of polishing with Racon (using Nanopore data) and 5 rounds with Pilon (using Illumina data).
I have now looked at my FASTA files and the contigs are not sorted from longest to shortest, and the contig numbering is not in numerical order either:
>contig_1
>contig_102
>contig_103
>contig_104
>contig_105
>contig_106
>contig_107
>contig_11
>contig_110
>contig_111
>contig_112
>contig_113
>contig_114
>contig_117
>contig_120
>contig_121
>contig_122
>contig_124
>contig_125
>contig_128
My questions:
- Do I need to rename and sort the contigs at the first step, when I get the genome assembly FASTA files?
- Or should I sort the final polished assembly (longest to shortest) and rename the contigs then?
- Is there any specific way to rename contigs?
If you look, your contigs do appear to be sorted by their numbers; it's just not a "natural" sort. The names are compared character by character, and because the numbers do not all have the same number of digits and are not zero-padded, contig_102 comes before contig_11.
As genomax points out though, this rarely matters.
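To make the difference concrete, here is a minimal Python sketch that compares the two orderings on a few of the names from the question. The helper `natural_key` is a hypothetical name, not part of any assembler or library, and it assumes every name follows the same text-then-number pattern.

```python
import re

def natural_key(name):
    # Split the name into digit and non-digit runs, e.g. "contig_11" -> ["contig_", "11", ""].
    # Digit runs are converted to int so they compare numerically instead of as text.
    return [int(tok) if tok.isdigit() else tok for tok in re.split(r"(\d+)", name)]

names = ["contig_11", "contig_1", "contig_110", "contig_103", "contig_102"]

# Plain string sort: compared character by character, so "102" sorts before "11".
lexicographic = sorted(names)

# Natural sort: the numeric part is compared as a number.
natural = sorted(names, key=natural_key)

print(lexicographic)
print(natural)
```

The first list reproduces the order seen in the FASTA file; the second is the order a reader would usually expect.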
Hi, I understand your point, but if you look at the answer shared above, this is what I mean.
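If sorting and renaming is still wanted, one possible approach is sketched below in plain Python. It is only an illustration under stated assumptions: the function names `read_fasta` and `sort_and_rename`, the `contig_` prefix, and the padding width are all hypothetical choices, and the example assumes the FASTA file fits in memory. Zero-padding the new numbers means a plain text sort and a numeric sort give the same order.

```python
def read_fasta(lines):
    """Parse FASTA text lines into a list of (name, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            # Keep only the first word of the header as the contig name.
            header, chunks = (line[1:].split() or [""])[0], []
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records

def sort_and_rename(records, prefix="contig_", width=None):
    """Sort longest to shortest and assign zero-padded names.

    Returns (new_name, sequence, old_name) tuples so the mapping is kept.
    """
    ordered = sorted(records, key=lambda r: len(r[1]), reverse=True)
    if width is None:
        # Assumption: pad to the number of digits in the contig count.
        width = len(str(len(ordered)))
    return [
        (f"{prefix}{i:0{width}d}", seq, old)
        for i, (old, seq) in enumerate(ordered, start=1)
    ]

# Toy input standing in for a real assembly file.
demo = [">contig_1", "ACGT", ">contig_102", "ACGTACGTAC", "GG", ">contig_11", "ACGTAC"]
renamed = sort_and_rename(read_fasta(demo), width=3)
for new_name, seq, old_name in renamed:
    print(f">{new_name} (was {old_name}, {len(seq)} bp)")
```

For a real file, `read_fasta` could be given an open file handle instead of the toy list. Keeping the old name alongside the new one makes it possible to trace a renamed contig back to the original assembler output.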