How to subset a set of variations from a VCF on a specific chromosome and between 2 positions?
5.5 years ago
NT ▴ 20

Hi,

I'm a beginner with bash, so my question may seem stupid to some of you. I have an annotated VCF file with a large number of samples. I want to extract from it all the variations of a gene located on chromosome 9 ($1 = chr9) between positions ($2 = POS) 81583683 and 81689305. After some modifications, I tried the awk command awk '{$1== "chr9" && 81583683 <$2< 81689305}' VCF1 > VCF2 but I always get an error message.

Can anyone please tell me whether awk is the right tool for a selection with 2 conditions, or should I use another command?

Thank you

awk bash vcf subset • 1.3k views

Thank you for the help! I used the bcftools command after indexing the VCF file. My command line looks like this: bcftools view file1.vcf.gz "chr9:81583683-81689305" -O v file2.vcf. It works, but it doesn't return all the variations I want, only some of them; I want to get all the variations, even the duplicated ones.

while I want to get all the variations even the duplicated one

show us the variants ignored by the command above

A huge number of variations are ignored (I have a file with 800 samples and I want to find the variations for all the samples in this region). The command returns only some of the variations, and each of them only once. For example, if a variation appears in 5 samples, I want to find 5 lines with this variation in the generated file; with this command line, however, either I don't find it in the generated file at all or I find it just once (on one line).
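
A note on the expectation above: a VCF stores each variant once, with one genotype column per sample, so a region subset returns one line per site regardless of how many samples carry it. If the goal is one line per sample carrying each variant, the file has to be reshaped. Below is a minimal awk sketch of that idea, not a definitive method. It assumes GT is the first FORMAT subfield (the VCF specification requires this when GT is present) and treats any genotype containing a non-zero allele index as "carrying" the variant. The file names, sample names S1 to S3 and the records are made up for illustration.

```shell
# Tiny made-up multi-sample VCF: two variants, three hypothetical samples.
printf '##fileformat=VCFv4.2\n' > multi.vcf
printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2\tS3\n' >> multi.vcf
printf 'chr9\t81600000\t.\tC\tT\t.\t.\t.\tGT\t0/1\t0/0\t1/1\n' >> multi.vcf
printf 'chr9\t81650000\t.\tG\tA\t.\t.\t.\tGT\t0/0\t./.\t0|1\n' >> multi.vcf

# Print one line per (variant, sample) pair whose genotype has a non-reference allele.
awk -F '\t' -v OFS='\t' '
    /^##/     { next }                                            # skip meta lines
    /^#CHROM/ { for (i = 10; i <= NF; i++) name[i] = $i; next }   # remember sample names
    {
        for (i = 10; i <= NF; i++) {
            split($i, f, ":")          # f[1] is GT, assumed to be the first FORMAT subfield
            if (f[1] ~ /[1-9]/)        # genotype contains a non-zero allele index
                print $1, $2, $4, $5, name[i], f[1]
        }
    }' multi.vcf > per_sample.tsv
```

Running this on the output of the region subset, instead of on multi.vcf, would give one row per carrier sample for each variant in the gene.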

that's still not clear to me

5.5 years ago

you want:

 awk -F '\t' '($0 ~ /^#/ || ($1 == "chr9" && 81583683 < $2 && $2 < 81689305))' VCF1 > VCF2

or, better, after indexing the VCF1:

bcftools view vcf1.vcf.gz "chr9:81583683-81689305"
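
For the awk route, two things in the command from the question trip awk up. The test sits inside braces, so it is read as an action and not as a selection pattern. Also, awk has no chained comparison, so 81583683 <$2< 81689305 has to be written as two comparisons joined with &&. Here is a minimal, self-contained sketch of the two-condition filter on a toy file. The file names demo.vcf and subset.vcf, the records, and the variable names chr, lo and hi are made up for illustration. The bounds are written as inclusive here, on the assumption that the gene coordinates themselves should be kept, which also mirrors how a bcftools region is interpreted.

```shell
# Build a tiny tab-separated VCF for demonstration (made-up records).
printf '##fileformat=VCFv4.2\n' > demo.vcf
printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n' >> demo.vcf
printf 'chr9\t81583000\t.\tA\tG\t.\t.\t.\n' >> demo.vcf   # before the region
printf 'chr9\t81600000\t.\tC\tT\t.\t.\t.\n' >> demo.vcf   # inside the region
printf 'chr1\t81600000\t.\tG\tA\t.\t.\t.\n' >> demo.vcf   # right position, wrong chromosome
printf 'chr9\t81689305\t.\tT\tC\t.\t.\t.\n' >> demo.vcf   # exactly on the end boundary
printf 'chr9\t81700000\t.\tT\tC\t.\t.\t.\n' >> demo.vcf   # after the region

# Keep header lines, plus records on the chosen chromosome with lo <= POS <= hi.
# The condition is used as a pattern (no braces), so matching lines are printed.
awk -F '\t' -v chr="chr9" -v lo=81583683 -v hi=81689305 \
    '/^#/ || ($1 == chr && $2 >= lo && $2 <= hi)' demo.vcf > subset.vcf
```

Passing the chromosome and bounds with -v keeps the awk program itself unchanged when the region changes.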