Hi
I don't have any experience creating VCF files. I have a list of specific SNPs (a CSV file with SNP IDs) and would like to create a VCF file containing only these target SNPs. Could you please let me know how to do it? Thank you!
Kate
If you want to use vcftools, you can select SNPs either by ID or by position:
with --snps file_listing_snpIDs
or
with --positions file_listin_chr_and_positions
Check the manual for more information: http://vcftools.sourceforge.net/man_latest.html
For example, this could be a command:
vcftools --vcf input_file.vcf --snps mySNPs.txt --recode --recode-INFO-all --out SNPs_only
where mySNPs.txt looks like this:
rs12121
rs242343
rs2348724
.
.
.
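Since the question mentions a CSV file rather than a plain list, here is a minimal Python sketch for turning one CSV column into the one-ID-per-line text that --snps expects. The column name snp_id and the example rows are assumptions for illustration only; adjust them to match the real file.

```python
import csv
import io


def snp_ids_from_csv(handle, column="snp_id"):
    """Return unique SNP IDs from one CSV column, in first-seen order.

    The column name "snp_id" is a hypothetical default, not something
    vcftools requires.
    """
    seen = set()
    ids = []
    for row in csv.DictReader(handle):
        snp = (row.get(column) or "").strip()
        # Skip empty cells and IDs that were already collected
        if snp and snp not in seen:
            seen.add(snp)
            ids.append(snp)
    return ids


# Small in-memory CSV standing in for the real file (made-up rows)
example_csv = io.StringIO("snp_id,gene\nrs12121,A\nrs242343,B\nrs12121,A\n")
ids = snp_ids_from_csv(example_csv)

# One ID per line, the layout shown for mySNPs.txt above
snps_text = "\n".join(ids) + "\n"
```

For a real file, open the CSV with open("your_file.csv", newline="") and write snps_text to mySNPs.txt; both file names are placeholders.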
Assuming you have a list of human rs IDs, you can just collect the matching lines from the NCBI VCF (ftp://ftp.ncbi.nih.gov/snp/organisms/human_9606/VCF/) using GATK SelectVariants with the option --keepIDs. From the documentation for that option:
List of variant IDs to select: If a file containing a list of IDs is provided to this argument, the tool will only select variants whose ID field is present in this list of IDs. The matching is done by exact string matching. The expected file format is simply plain text with one ID per line.
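To make the "exact string matching" on the ID field concrete, here is a rough plain-Python sketch of the same idea. This is not GATK and not how GATK is implemented; it only illustrates keeping header lines plus the records whose ID column is in the list. The example records are made up.

```python
def select_by_id(vcf_lines, wanted_ids):
    """Keep header lines and records whose ID field exactly matches a wanted ID."""
    wanted = set(wanted_ids)
    kept = []
    for line in vcf_lines:
        # Header and meta lines start with '#': always keep them
        if line.startswith("#"):
            kept.append(line)
            continue
        fields = line.rstrip("\n").split("\t")
        # In VCF, the ID is the third tab-separated column
        if len(fields) > 2 and fields[2] in wanted:
            kept.append(line)
    return kept


# Made-up records for illustration
example_vcf = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    "1\t1000\trs12121\tA\tG\t.\t.\t.\n",
    "1\t2000\trs999\tC\tT\t.\t.\t.\n",
    "2\t3000\trs1212\tG\tA\t.\t.\t.\n",
]
selected = select_by_id(example_vcf, ["rs12121"])
```

Note that rs1212 is not selected even though it is a prefix of rs12121, because the comparison is on the whole ID string.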
What would the file
file_listin_chr_and_positions
have to look like? I didn't find it in the manual. Maybe like this?
Edit: OK, I thought I could select SNPs within a certain genomic range with that list, but apparently that is not the case.
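As far as I can tell from the vcftools manual, the file given to --positions is plain text with one site per line: chromosome and position, separated by a tab. Each line names a single site, not a range; for ranges the manual describes a separate --bed option. A minimal Python sketch that writes such a file is below. The sites are made up, and the chromosome names have to match the ones used in your VCF (for example "1" versus "chr1").

```python
def positions_file_text(sites):
    """Format (chromosome, position) pairs as tab-separated lines, one site per line."""
    return "".join(f"{chrom}\t{pos}\n" for chrom, pos in sites)


# Made-up sites for illustration
sites = [("1", 1000), ("1", 2000), ("2", 3000)]
positions_text = positions_file_text(sites)
```

Writing positions_text to a text file and passing that file to --positions should then work in place of the --snps list.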