11.5 years ago
chris.mit7
Here's a quick bash script to reorder a VCF file. It's currently set up to reorder a VCF into karyotypic order. I've seen enough threads about this, and been annoyed by it myself enough times, that hopefully someone else finds this useful.
# Split the header and each chromosome into its own intermediate file.
grep -P "^#" "$1" > "${1}.header"
for i in {1..22}
do
    grep -P "^$i\t" "$1" > "${1}.chr$i"
done
grep -P "^M\t" "$1" > "${1}.chrM"
grep -P "^X\t" "$1" > "${1}.chrX"
grep -P "^Y\t" "$1" > "${1}.chrY"
# Concatenate in the target order: header, M, 1..22, X, Y.
toCat=("${1}.header" "${1}.chrM")
for i in {1..22}
do
    toCat+=("${1}.chr$i")
done
toCat+=("${1}.chrX" "${1}.chrY")
cat "${toCat[@]}" > "${1}.karo.vcf"
Usage: bash script.sh filename.vcf
It'll make a new file, filename.vcf.karo.vcf, and leave the intermediates (filename.vcf.header and filename.vcf.chr*) in place.
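If the intermediate files are not wanted, the same ordering can be streamed straight to stdout. This is only a rough sketch, not the script above: the function name karyo_reorder is made up for illustration, it swaps grep -P for an exact match on the first column with awk, and it assumes chromosome names without a "chr" prefix (M, 1..22, X, Y), as in the script.

```shell
# Hypothetical helper (name is an assumption): print the header, then
# records for M, 1..22, X, Y in that order. Runs in a subshell so its
# variables do not leak into the caller.
karyo_reorder() (
    vcf=$1
    # Header lines start with '#'.
    grep '^#' "$vcf"
    for chrom in M $(seq 1 22) X Y
    do
        # Exact match on column 1, so "1" does not also pick up "10".
        awk -F '\t' -v c="$chrom" '$1 == c' "$vcf"
    done
)
```

Usage would be along the lines of: karyo_reorder filename.vcf > filename.karo.vcf. Like the original, it reads the input once per chromosome, and records on contigs outside that list are dropped.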
I have posted this before: a GNU sort modified for alphanumeric sorting. It adds a new option "-N" that sorts, for example, "chrM, chr10, chr2" to "chr2, chr10, chrM". For VCF sorting:
(grep ^# in.vcf; grep -v ^# in.vcf|sort -k1,1N -k2,2n) > out.vcf
This works for any TAB-delimited format. It is more general, though for an already sorted VCF it is slower.
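Where the "-N" option is not available, a similar result can probably be had from an unmodified GNU coreutils sort using its version-sort key flag "V". This is a swapped-in technique, not the sort described above; the function name vcf_sort_v is invented for illustration, and it assumes GNU sort. Note that it places M after the numbered chromosomes (1, 2, ..., 22, M, X, Y), not first.

```shell
# Hypothetical helper (name is an assumption): header first, then records
# ordered by chromosome (version sort) and position (numeric).
# Assumes GNU sort; the V key flag is not part of POSIX sort.
vcf_sort_v() (
    vcf=$1
    tab=$(printf '\t')
    grep '^#' "$vcf"
    grep -v '^#' "$vcf" | LC_ALL=C sort -t "$tab" -k1,1V -k2,2n
)
```

Usage would be along the lines of: vcf_sort_v in.vcf > out.vcf. Unlike the per-chromosome grep approach, this also orders positions within each chromosome and keeps records on any extra contigs.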