Dear all,
I am trying to sort my GFF file in ascending order by chromosome (chr1, chr2, chr3, ... chrX), but I have not succeeded. Neither sortBed nor Unix sort produces a karyotype order (chr1, chr2, ... chr10, chr11, chrM, chrX). I found one possible solution, sort -V -k1,1, which works fine on another system of mine (CentOS; sort version 8.22), but unfortunately the sort on my main system (RedHat; sort version 5.97) does not have the -V option. Is there any alternative?
Note: please keep in mind that I am not sorting my GFF file like a typical BED file (sort -k 1,1 -k2,2n), as it is not a typical BED file.
My input.gff looks like:
chr1 . miRNA_primary_transcript 451141 451218 . + .
chr1 . miRNA_primary_transcript 1275348 1275428 . + .
chr1 . miRNA_primary_transcript 2806071 2806208 . + .
chr10 . miRNA_primary_transcript 4333896 4333977 . - .
chr10 . miRNA_primary_transcript 10295360 10295450 .
chr10 . miRNA_primary_transcript 15983153 15983233 .
chr11 . miRNA_primary_transcript 2162553 2162662 . - .
chr11 . miRNA_primary_transcript 3157038 3157122 . + .
chr2 . miRNA_primary_transcript 59942577 59942660 .
chr20 . miRNA_primary_transcript 5116644 5116774 . + .
chr25 . miRNA_primary_transcript 35855072 35855176 .
chr3 . miRNA_primary_transcript 13208734 13208831 .
......
The total number of chromosomes is 25.
See: How To Sort Bed Format File
Those answers won't work for GFF as is, because you have declarations and comments in the header, and most GFF files will have comment lines separating each feature. You could skip all those lines, but that would create something that would not be a usable GFF for most purposes.
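One possible workaround that does not need sort -V is a decorate/sort/undecorate pipeline. This is only a rough sketch under some assumptions, not a tested recipe for every GFF: the function name gff_karyosort is made up for illustration, chromosome names are assumed to look like chr<number> or chr<letters>, the start coordinate is assumed to be in column 4, and only the leading block of # lines is kept as the header. Comment lines between features are dropped here, because they lose their position once the features are reordered, and that may not suit every downstream tool.

```shell
# Hypothetical helper: karyotype-sort a GFF file without relying on `sort -V`.
# Usage: gff_karyosort input.gff > sorted.gff
gff_karyosort() {
    gff=$1

    # 1) Print the leading header/comment block unchanged,
    #    stopping at the first line that does not start with '#'.
    awk '/^#/ { print; next } { exit }' "$gff"

    # 2) Decorate each feature line with three tab-separated sort keys:
    #    numeric rank, chromosome name, start coordinate.
    #    Numbered chromosomes get their number as rank; anything else
    #    (chrM, chrX, ...) gets a large rank and is ordered by name.
    # 3) Sort on those keys, then cut them off again.
    awk '
        /^#/ || NF == 0 { next }          # skip comments and blank lines
        {
            name = $1
            sub(/^chr/, "", name)
            rank = (name ~ /^[0-9]+$/) ? name + 0 : 1000000
            print rank "\t" $1 "\t" $4 "\t" $0
        }
    ' "$gff" |
        LC_ALL=C sort -t "$(printf '\t')" -k1,1n -k2,2 -k3,3n |
        cut -f4-
}
```

With this, something like gff_karyosort input.gff > sorted.gff should give chr1, chr2, ... chr10, chr11, ... followed by any non-numeric names in alphabetical order, with features ordered by start position within each chromosome. It only uses the -t, -k and -n options of sort, so it should not depend on a recent coreutils, but I have not tried it on version 5.97.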