extract specific population (vcf) from 1000 genomes - keep option
10.2 years ago
muralinmars ▴ 100

Hi all,

How can I extract 1000 Genomes data in VCF format for a specific population using vcftools? Currently I am using tabix to download the data:

tabix -fh ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20110521/ALL.chr`echo ${REGION} | awk -F [\:] '{print $1}'`.phase1_release_v3.20101123.snps_indels_svs.genotypes.vcf.gz $REGION > temp.vcf
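
The chromosome part of the region can also be taken with shell parameter expansion rather than echo and awk. This is a minimal sketch, assuming REGION has the form chromosome:start-end; the region value and the variable names chrom and url are made up for illustration, and the tabix command is only printed, not run:

```shell
# Hypothetical example region (chromosome:start-end)
REGION="2:1000000-1100000"

# Strip everything from the first ':' onwards, leaving the chromosome name
chrom=${REGION%%:*}

# Build the per-chromosome file URL used in the tabix call
url="ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20110521/ALL.chr${chrom}.phase1_release_v3.20101123.snps_indels_svs.genotypes.vcf.gz"

# Print the command that would be run
echo "tabix -fh $url $REGION > temp.vcf"
```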

I then keep only the individuals from the European populations (CEU, GBR, FIN, TSI) with the --keep option, passing a text file (POP_FILE) that lists their individual IDs:

vcftools --vcf temp.vcf --keep $POP_FILE --recode > temp.vcf

Where am I going wrong? The initial VCF file is produced, but the file filtered for the specific population is not.

Any suggestions, please?

sequencing linux vcftools • 7.2k views
10.2 years ago

When using

vcftools --vcf temp.vcf --keep $POP_FILE --recode > temp.vcf

the shell truncates temp.vcf when it sets up the output redirection, before vcftools is started.

vcftools is therefore reading an empty file.

Write the output to a different file instead:

vcftools --vcf temp.vcf --keep $POP_FILE --recode > temp2.vcf
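
The truncation can be reproduced without vcftools. This is a minimal sketch that uses awk as a stand-in for any command that reads its input file by name; the temporary directory and the file contents are made up for the demonstration:

```shell
workdir=$(mktemp -d)
printf '##fileformat=VCFv4.1\n#CHROM\tPOS\n' > "$workdir/temp.vcf"

# The shell opens temp.vcf for writing (and truncates it) before awk runs,
# so awk sees an empty input and prints nothing.
awk '{ print }' "$workdir/temp.vcf" > "$workdir/temp.vcf"

# -s is true only if the file exists and is not empty
if [ -s "$workdir/temp.vcf" ]; then
    status="non-empty"
else
    status="empty"
fi
echo "temp.vcf is $status after reading and writing the same file"
```

The same happens with any program, because the redirection is handled by the shell and not by the program being run.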

Thanks. I changed the name of the VCF file generated after the --keep option to temp2.

But it does not create the temp2 file (while the recode.vcf file is generated properly after filtering for the specific population).

Here is the code again:

#select specific population
if [ "$POP_FILE" != "" ]; then
    vcftools --vcf temp.vcf --keep $POP_FILE --recode --out popfilter > temp2.vcf 2> /dev/null
else
    cp -f temp.vcf temp2.vcf
fi

You're redirecting stderr to /dev/null. What was the content of the error message?
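
One likely cause, whatever the error message says: vcftools --recode writes its result to a file named after the --out prefix (popfilter.recode.vcf with --out popfilter) and sends nothing to standard output, so the redirection leaves temp2.vcf empty. Adding --stdout sends the recoded VCF to standard output instead. The following is a sketch under those assumptions, not a tested pipeline; the function name filter_population is made up, and stderr is left visible so that any error can be read:

```shell
# Hypothetical helper.
#   $1 = input VCF
#   $2 = file with one individual ID per line (empty string means no filtering)
#   $3 = output VCF (must differ from the input file)
filter_population() {
    in_vcf=$1
    pop_file=$2
    out_vcf=$3
    if [ -n "$pop_file" ]; then
        # --stdout makes --recode print the filtered VCF to standard output,
        # so the shell redirection receives it
        vcftools --vcf "$in_vcf" --keep "$pop_file" --recode --stdout > "$out_vcf"
    else
        # No population file given: pass the input through unchanged
        cp -f "$in_vcf" "$out_vcf"
    fi
}
```

It would be called as filter_population temp.vcf "$POP_FILE" temp2.vcf. Alternatively, drop the redirection and use the popfilter.recode.vcf file that vcftools already writes.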
