Splitting a VCF into 10 Mb window subfiles for a given chromosome
2.4 years ago

Hello everyone, is there an efficient way to split a given VCF file (let's say for chromosome 1) into several sub-VCF files, each covering a 10 Mb window of that chromosome? Many thanks in advance!

vcf bedtools bcftools
2.4 years ago

I wrote http://lindenb.github.io/jvarkit/VcfToIntervals.html

(not tested)

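# in.vcf.gz must be bgzip-compressed and indexed (e.g. "bcftools index in.vcf.gz") for the region queries in the loop
# awk converts each BED line (0-based start) into a 1-based chrom:start-end region
bcftools view in.vcf.gz | java -jar dist/vcf2intervals.jar --bed --distance "10mb" --min-distance 0 | awk '{printf("%s:%d-%s\n",$1,int($2)+1,$3);}' | while read -r R
do
    # write one compressed VCF per region; ':' and '-' in the region become '_' in the file name
    bcftools view -O z -o "${R//[:-]/_}.out.vcf.gz" "in.vcf.gz" "${R}"
done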
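
If you would rather not depend on jvarkit, here is a minimal sketch (also not tested on real data) that builds fixed-size windows from the chromosome length instead of from the variants themselves. The function name make_windows, the contig name chr1 and the length 248956422 are assumptions for illustration only; take the real contig name and length from your VCF header or from the reference .fai file.

```bash
# Hypothetical helper: print 1-based, inclusive regions "chrom:start-end"
# that tile positions 1..length in windows of the given size.
make_windows() {
    local chrom=$1 length=$2 size=$3 start end
    for (( start = 1; start <= length; start += size )); do
        end=$(( start + size - 1 ))
        # clip the last window to the end of the chromosome
        if (( end > length )); then
            end=$length
        fi
        printf '%s:%d-%d\n' "$chrom" "$start" "$end"
    done
}

# Usage (assumes an indexed in.vcf.gz and a contig named chr1 of length 248956422):
#   make_windows chr1 248956422 10000000 | while read -r R
#   do
#       bcftools view -O z -o "${R//[:-]/_}.out.vcf.gz" in.vcf.gz "${R}"
#   done
```

Unlike the jvarkit approach, this can produce windows that contain no variants at all, so some output files may hold only a header. Depending on your bcftools version and overlap settings, a record that spans a window boundary may also end up in both neighbouring files.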

Thank you!
