I have a set of polymorphisms and their positions on each chromosome in a text file. I also have VCF files of variants (1000 Genomes). I want to count the number of variants within 1 kb windows around each polymorphism.
To do this, I think I need to create BED files for each appropriate window.
So for example, if I have polymorphisms at the following positions:
2000
3000
4000
5000
The first BED file would contain these start and end positions:
1500 2499
2500 3499
3500 4499
4500 5499
The second BED file would then contain:
1000 2999
2000 3999
3000 4999
4000 5999
Because I'd like the counts to refer to windows of the same size, I'd then subtract the number of variants found with the first BED file from the number found with the second, which leaves the count for the outer 1 kb (500 bp on each side).
And so on for the length of the chromosome.
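To make the subtraction concrete, here is a minimal Python sketch. It counts directly from a sorted list of variant positions rather than from BED files, purely to illustrate the arithmetic. The variant positions are made-up toy values, and the names `count_in_window` and `ring_counts` are hypothetical, not from any existing tool.

```python
from bisect import bisect_left, bisect_right

def count_in_window(sorted_variants, start, end):
    """Number of variant positions v with start <= v <= end."""
    return bisect_right(sorted_variants, end) - bisect_left(sorted_variants, start)

def ring_counts(sorted_variants, position, step, n_rings):
    """Counts per equal-size ring: window k minus window k-1."""
    rings = []
    previous_total = 0
    for k in range(1, n_rings + 1):
        # Same numbering as the example: [position - k*step, position + k*step - 1]
        total = count_in_window(sorted_variants,
                                position - k * step,
                                position + k * step - 1)
        rings.append(total - previous_total)
        previous_total = total
    return rings

# Toy variant positions (invented for illustration only), sorted ascending.
variants = [1200, 1600, 1900, 2100, 2700, 2950]
print(ring_counts(variants, 2000, 500, 2))  # [3, 3]
```

The first value is the count in 1500-2499, the second is the count in 1000-2999 minus the first, i.e. the variants that fall only in the outer 1 kb.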
I know how to count the variants; I just don't know how to automate the creation of such a vast number of BED files. Does anyone have any suggestions?
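One possible starting point, as a rough Python sketch rather than a tested pipeline: loop over window sizes and write one BED file per size. The chromosome name `chr1`, the chromosome length of 7000, the file naming, and the function names are all assumptions for illustration. The coordinates copy the numbering in the example above verbatim. Strict BED is 0-based and half-open, so the start and end may need shifting by one depending on the tool that reads the files.

```python
import os
import tempfile

def windows(positions, half_width, chrom_length):
    """(start, end) pairs around each position, numbered as in the example
    (end = position + half_width - 1) and clipped to the chromosome."""
    return [(max(0, p - half_width), min(chrom_length, p + half_width - 1))
            for p in positions]

def write_bed_series(positions, chrom, chrom_length, step, out_dir):
    """Write one BED file per window size; return the paths written."""
    os.makedirs(out_dir, exist_ok=True)
    # Farthest distance from any position to either chromosome end.
    max_reach = max(max(p, chrom_length - p) for p in positions)
    n_files = -(-max_reach // step)  # ceiling division
    paths = []
    for k in range(1, n_files + 1):
        path = os.path.join(out_dir, f"windows_{k:06d}.bed")
        with open(path, "w") as fh:
            for start, end in windows(positions, k * step, chrom_length):
                fh.write(f"{chrom}\t{start}\t{end}\n")
        paths.append(path)
    return paths

positions = [2000, 3000, 4000, 5000]
out_dir = tempfile.mkdtemp()  # scratch directory for this demo
paths = write_bed_series(positions, "chr1", 7000, 500, out_dir)
print(len(paths), "BED files written to", out_dir)
```

With a step of 500 bp on a real chromosome this produces hundreds of thousands of files, so capping the number of window sizes, or keeping all windows in one file with an extra column for the window size, may be more practical.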