18 months ago
mohsamir2016
Dear all,
I have an Excel file that I created from a VCF file of SNPs common to 6 samples. It contains only the chromosome and position of each SNP (see example table1).
Now I would like to obtain the other information (e.g. alleles, genotype, depth, etc.) for these positions from the VCF files of the 6 samples.
I tried using awk, for example for position 23432 on chromosome 1 in each of the 6 files:
awk -F " " '$1=="1" && $2=="23432"' file1.vcf
awk -F " " '$1=="1" && $2=="23432"' file2.vcf
awk -F " " '$1=="1" && $2=="23432"' file3.vcf
awk -F " " '$1=="1" && $2=="23432"' file4.vcf
awk -F " " '$1=="1" && $2=="23432"' file5.vcf
awk -F " " '$1=="1" && $2=="23432"' file6.vcf
The issue is that I have thousands of SNP positions, so I need an automated way to do this.
Could you advise on that?
Thanks
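One way to automate the awk approach above is to let awk read the list of positions first and then filter each VCF against it. This is only a minimal sketch under some assumptions: the Excel sheet has been exported as a tab-separated text file with chromosome and position in the first two columns and no header, and the chromosome names are written the same way as in the VCFs (e.g. "1" and not "chr1"). The directory awk_demo, the file positions.tsv and the two toy VCFs are made up here so the example runs on its own.

```shell
#!/bin/sh
# Toy inputs in a scratch directory (hypothetical names); use your real files instead.
mkdir -p awk_demo
printf '1\t23432\n2\t500\n' > awk_demo/positions.tsv
hdr='#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\n'
printf "##fileformat=VCFv4.2\n$hdr"'1\t100\t.\tA\tG\t50\tPASS\t.\tGT:DP\t0/1:10\n1\t23432\t.\tC\tT\t60\tPASS\t.\tGT:DP\t1/1:12\n2\t500\t.\tG\tA\t40\tPASS\t.\tGT:DP\t0/1:8\n' > awk_demo/file1.vcf
printf "##fileformat=VCFv4.2\n$hdr"'1\t23432\t.\tC\tT\t55\tPASS\t.\tGT:DP\t0/1:9\n2\t501\t.\tT\tC\t30\tPASS\t.\tGT:DP\t0/1:7\n' > awk_demo/file2.vcf

for vcf in awk_demo/file1.vcf awk_demo/file2.vcf; do
    awk -F '\t' '
        NR == FNR { keep[$1 FS $2]; next }   # first file: remember each CHROM/POS pair
        /^#/      { next }                   # skip VCF header lines
        ($1 FS $2) in keep                   # print records at a wanted position
    ' awk_demo/positions.tsv "$vcf" > "${vcf%.vcf}.subset.tsv"
done
```

Each VCF is read only once, however many positions are in the list. If the positions file was saved on Windows, the trailing carriage returns would need to be stripped first, otherwise no position will match.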
I looked into bcftools view -R, but I could not understand it from the documentation. Could you please give me example code that I can run to test the results?
Thanks
What don't you understand from the documentation?
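For reference, a rough sketch of how bcftools view -R could be used here. The regions file given to -R can be a plain tab-separated file with two columns, CHROM and POS (1-based), and the input VCF has to be bgzip-compressed and indexed. The function name extract_sites and the file name positions.tsv are made up for illustration, and the sketch assumes bgzip and bcftools are installed and on the PATH.

```shell
# Hypothetical helper: subset each VCF to the sites listed in a positions file.
# Usage: extract_sites positions.tsv file1.vcf file2.vcf ...
extract_sites() {
    positions=$1
    shift
    for vcf in "$@"; do
        bgzip -c "$vcf" > "$vcf.gz"     # -R needs a bgzip-compressed input
        bcftools index -t "$vcf.gz"     # and a tabix (.tbi) index next to it
        # Keep only the records at the listed positions, written as plain VCF.
        bcftools view -R "$positions" -o "${vcf%.vcf}.subset.vcf" "$vcf.gz"
    done
}
```

It would be called as extract_sites positions.tsv file1.vcf file2.vcf and so on for all six files. The chromosome names in positions.tsv must match those in the VCFs. If indexing is not an option, bcftools view -T takes the same kind of file and streams through an unindexed VCF instead.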