Entering edit mode
5 months ago
Begonia_pavonina
▴
200
I have 19 samples, and I wish to get the common loci shared by 8 of them. For this I use the code below:
# List of VCF files
vcf_files=(
"Beg1_1.g.vcf" "Beg1_2.g.vcf" "Beg1_4.g.vcf" "Beg1_5.g.vcf"
"Beg2_2.g.vcf" "Beg2_5.g.vcf" "Beg3_3.g.vcf" "Beg3_5.g.vcf"
"Beg3_7.g.vcf" "Beg5_1.g.vcf" "Beg5_7.g.vcf" "Beg6_4.g.vcf"
"Beg6_6.g.vcf" "Beg6_7.g.vcf" "Beg7_3.g.vcf" "Beg8_7.g.vcf"
"Beg9_2.g.vcf" "Beg9_7.g.vcf" "hill.g.vcf" )
# Compress VCF files
for file in "${vcf_files[@]}"; do
bgzip -c "$file" > "${file}.gz" done
# Index the compressed VCF files
for file in "${vcf_files[@]}"; do
tabix -p vcf "${file}.gz" done
# Intersect VCF files
bcftools isec -n=8 "${vcf_files[@]/%/.gz}" > intersected.vcf
But looking at the result I realized that the intersected.vcf
is odd. There is no vcf header, and the file is not really in vcf format.
Would anyone know what I am doing wrong?
1185_polypolish 59005 C <NON_REF> 1110010011011000000
1185_polypolish 59011 T <NON_REF> 1011001011011000000
1185_polypolish 59020 A <NON_REF> 1011001011011000000
1185_polypolish 59025 C T,<NON_REF> 1010011011011000000
1185_polypolish 59028 G A,<NON_REF> 1010011011011000000
1185_polypolish 59031 T C,<NON_REF> 1010011011011000000
1185_polypolish 59061 T C,<NON_REF> 1011001011011000000
1185_polypolish 59062 A <NON_REF> 1011001011011000000
1104_polypolish 108092 T G,<NON_REF> 0001000110011110010
1104_polypolish 108095 T G,<NON_REF> 0001000110011110010
Check the
README.txt
file that should be present in the output folder. You also seem to be using gvcf's.