mixedsort is pretty good, but it will put chrM before chrX and then chrY.
If you have your own arbitrary order, you should just use factors:
> df<-data.frame("chr"=c("chr1","chrM","chr10","chr2","chrX","chr2"),"val"=c(1,2,3,4,5,6))
> df
    chr val
1  chr1   1
2  chrM   2
3 chr10   3
4  chr2   4
5  chrX   5
6  chr2   6
> chrOrder<-c(paste("chr",1:22,sep=""),"chrX","chrY","chrM")
> df$chr<-factor(df$chr, levels=chrOrder)
> df$chr
[1] chr1 chrM chr10 chr2 chrX chr2
Levels: chr1 chr2 chr3 chr4 chr5 chr6 chr7 chr8 chr9 chr10 chr11 chr12 chr13 chr14 chr15 chr16 chr17 chr18 chr19 chr20 chr21 chr22 chrX chrY chrM
> df[order(df$chr),]
    chr val
1  chr1   1
4  chr2   4
6  chr2   6
3 chr10   3
5  chrX   5
2  chrM   2
thank you for pointing this out.
+1 for being in base R, and in the spirit of the R language. Also, this solves @zev.kronenberg's issue of sorting:
df[order(df$chr, df$pos), ]
works as expected, because the factor is sorted based on its underlying integer codes. The levels argument to factor() is where the magic happens.
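To make the two-key sort concrete, here is a minimal sketch. It assumes a numeric pos column, which the data frame in the answer does not have, so the positions below are made up purely for illustration.

```r
# Hypothetical data: the 'pos' values are invented for this example
df <- data.frame(
  chr = c("chr1", "chrM", "chr10", "chr2", "chrX", "chr2"),
  pos = c(500, 100, 300, 900, 200, 150),
  stringsAsFactors = FALSE
)

# Same custom chromosome order as in the answer
chrOrder <- c(paste0("chr", 1:22), "chrX", "chrY", "chrM")
df$chr <- factor(df$chr, levels = chrOrder)

# Each value is stored as the index of its level in chrOrder;
# order() compares these integer codes, not the strings
codes <- as.integer(df$chr)

# Sort by chromosome first, then by position within each chromosome
sorted <- df[order(df$chr, df$pos), ]
print(sorted)
```

Here codes should come out as 1, 25, 10, 2, 23, 2, which is why chr10 lands after chr2 and chrM lands last. One thing to watch: any value missing from chrOrder becomes NA in the factor, and order() puts NA rows at the end by default.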