Selecting genotypes for a specific chromosome belonging to a specific patient from a multi-patient VCF file
3.7 years ago
dk0319 ▴ 70

I have a VCF file called HG002-HG003-HG004.jointVC.filter.vcf that contains data from three patients. I am interested in isolating the genotypes for chr19 from patient HG002 only. Is there a way to do this using vcftools or an alternative package? Can anyone recommend a manual or an example showing how to accomplish this?

sequence • 1.6k views
3.7 years ago
desouzareis.r ▴ 280

Hi,

You could try bcftools, with something like this:

bgzip -c HG002-HG003-HG004.jointVC.filter.vcf > HG002-HG003-HG004.jointVC.filter.vcf.gz
tabix -p vcf HG002-HG003-HG004.jointVC.filter.vcf.gz
bcftools view -s HG002 -r chr19 HG002-HG003-HG004.jointVC.filter.vcf.gz > HG002.chr19.vcf

I tried this command and got this error.

Failed to open HG002-HG003-HG004.jointVC.filter_Annotated.vcf: not compressed with bgzip

Any idea how to correct this?


Hi,

You should compress and index your VCF file first:

bgzip -c file.vcf > file.vcf.gz
tabix -p vcf file.vcf.gz
3.7 years ago
dk0319 ▴ 70
grep "HG002" HG002-HG003-HG004.jointVC.filter_Annotated.vcf  # finds the patient ID, for use with --indv

vcftools --vcf HG002-HG003-HG004.jointVC.filter_Annotated.vcf --chr 19 --indv Sample_Diag-excap51-HG002-EEogPU --out HG002-chr19.vcf --recode

I found that this works: it generates a file containing only the chr19 records from patient HG002. Note that vcftools treats --out as a prefix and appends .recode.vcf, so the output here is named HG002-chr19.vcf.recode.vcf.
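The ID given to --indv has to match one of the sample column names on the #CHROM header line exactly. Here is a minimal sketch for listing those names with standard shell tools. The toy file and its sample names are made up for illustration. Where bcftools is installed, `bcftools query -l file.vcf` should print the same list.

```shell
# Build a tiny VCF with three hypothetical sample columns.
vcf=$(mktemp)
{
  printf '##fileformat=VCFv4.2\n'
  printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tsampleA-HG002\tsampleB-HG003\tsampleC-HG004\n'
  printf 'chr19\t200\t.\tC\tT\t50\tPASS\t.\tGT\t1/1\t0/1\t0/0\n'
} > "$vcf"

# Columns 10 onward of the #CHROM line hold the sample names, one per genotype column.
samples=$(grep -m1 '^#CHROM' "$vcf" | cut -f10- | tr '\t' '\n')
echo "$samples"
```

Copying the name from this list avoids typos in long IDs such as the one used above.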


Use programs meant for handling VCF files whenever possible. Utilities like grep may miss subtle things.
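To illustrate one way a plain text search can mislead, here is a small sketch on a made-up toy VCF. It compares a substring search for chr1 with an exact match on the CHROM column. The data and variable names are hypothetical and only meant to show the idea.

```shell
# Toy VCF (hypothetical data): two contig header lines and three records.
vcf=$(mktemp)
{
  printf '##fileformat=VCFv4.2\n'
  printf '##contig=<ID=chr1>\n'
  printf '##contig=<ID=chr19>\n'
  printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tHG002\n'
  printf 'chr1\t100\t.\tA\tG\t50\tPASS\t.\tGT\t0/1\n'
  printf 'chr19\t200\t.\tC\tT\t50\tPASS\t.\tGT\t1/1\n'
  printf 'chr19\t300\t.\tG\tA\t50\tPASS\t.\tGT\t0/1\n'
} > "$vcf"

# Naive substring search: counts every line containing "chr1",
# including contig header lines and chr19 records.
naive=$(grep -c 'chr1' "$vcf")

# Exact match on the first (CHROM) column, skipping header lines.
exact=$(awk -F'\t' '!/^#/ && $1 == "chr1" {n++} END {print n+0}' "$vcf")

echo "substring match: $naive lines; exact CHROM match: $exact record(s)"
```

On this toy file the substring search picks up header lines and chr19 records in addition to the single chr1 record, which is the kind of subtle mismatch a VCF-aware tool avoids.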
