Remove non-biallelic SNPs from ped
10.1 years ago

Hello

When merging a list of bed files in plink I get an error

Error: 1785 variants with 3+ alleles present.

The files were originally in vcf format and were converted to bed files using vcftools (with the --plink flag). Is there a simple way to scan through the bed files with plink or another tool and remove these multiallelic variants?

R sequence SNP • 22k views
9.5 years ago

Not sure what you mean exactly. But have you tried

--biallelic-only strict

in plink 1.90_beta_3o? Further information is at https://www.cog-genomics.org/plink2/input


--biallelic-only is retired in plink 2. Use e.g. --max-alleles 2 instead.

10.1 years ago
Scott ▴ 110

Not sure about your exact question, but in VCFtools you can quickly filter for only bi-allelic sites using:

vcftools --vcf file1.vcf --min-alleles 2 --max-alleles 2 --recode --out output_file_name

(--out takes a prefix; vcftools writes the filtered file as output_file_name.recode.vcf.)
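
To see how many multiallelic records a VCF holds before or after a filter like the one above, a quick check is to count the records whose ALT column contains a comma. This is only a minimal sketch: `toy.vcf` and its three records are made up for illustration, and `count_multiallelic` is a hypothetical helper name, not part of vcftools.

```shell
# Build a tiny illustrative VCF (hypothetical data, tab-separated).
printf '##fileformat=VCFv4.2\n' > toy.vcf
printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n' >> toy.vcf
printf '1\t100\trs1\tA\tG\t.\tPASS\t.\n' >> toy.vcf
printf '1\t200\trs2\tC\tT,G\t.\tPASS\t.\n' >> toy.vcf
printf '1\t300\trs3\tG\tA\t.\tPASS\t.\n' >> toy.vcf

# Count records with more than one ALT allele (a comma in column 5).
count_multiallelic() {
    awk -F'\t' '!/^#/ && $5 ~ /,/ { n++ } END { print n + 0 }' "$1"
}

count_multiallelic toy.vcf
```

Running the same function on the filtered output should print 0 if the filter did what was intended.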

Is there no way to do it in the ped file (sorry, not the vcf)? Otherwise it has to be reconverted first.


I just thought if they were originally in VCF format this would be easiest. I am most familiar with VCF, that's why I suggested this. I'm sure there are other ways.

10.1 years ago

If you want to keep some or all of those variants, and then treat least common alternate allele calls as missing, you should perform the merge with another tool, and then use plink --vcf to import from the merged VCF.

However, if you just want to get rid of all the triallelic variants, refer to the last batch of sample commands under https://www.cog-genomics.org/plink2/data#merge3. The .missnp file generated during the failed merge is designed to be used with --exclude.
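
For a rough look at which variant IDs would trip the 3+ allele check, the .bim files can be pooled and scanned with awk. This is a sketch under assumptions, not what plink does internally: the toy files and the helper name `list_multiallelic_ids` are invented for illustration, and the scan only counts distinct allele codes per ID (so a strand flip such as A/G versus T/C is flagged as well). The .missnp file from the failed merge remains the authoritative list.

```shell
# Two toy .bim files (hypothetical data): chr, ID, cM, bp, allele1, allele2.
printf '1\trs1\t0\t100\tA\tG\n1\trs2\t0\t200\tC\tT\n' > toyA.bim
printf '1\trs1\t0\t100\tA\tG\n1\trs2\t0\t200\tC\tG\n' > toyB.bim

# Print IDs that carry more than two distinct non-missing alleles
# once all given .bim files are pooled ("0" is the missing-allele code).
list_multiallelic_ids() {
    awk '{
        for (col = 5; col <= 6; col++) {
            key = $2 SUBSEP $col
            if ($col != "0" && !(key in seen)) {
                seen[key] = 1
                n[$2]++
            }
        }
    }
    END { for (id in n) if (n[id] > 2) print id }' "$@" | sort
}

list_multiallelic_ids toyA.bim toyB.bim
```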


I am merging a whole list of files, and the problem is that I don't know which subfile the variants in the .missnp file you mention come from. Is there still a way to use this file to exclude the SNPs?


Yes, it's safe to --exclude [prefix].missnp on every single fileset. Nothing bad happens if a variant named in the .missnp file is not in the current fileset.
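
Applied to a whole list of filesets, that advice could be scripted roughly as below. This is a sketch rather than a tested pipeline: the prefixes `set1`, `set2`, `set3`, the file name `merge_attempt.missnp`, the `_tmp` suffix, and the output prefix `merged_clean` are all placeholders, and the function only prints the plink 1.9 commands so they can be inspected before anything is run. It assumes binary filesets and a merge list with one prefix per line.

```shell
# Print one "plink --exclude" command per fileset, then a re-merge command.
# Arguments: the .missnp file from the failed merge, followed by the
# binary fileset prefixes (all names used here are hypothetical).
print_exclude_cmds() {
    missnp=$1
    shift
    : > filtered_list.txt          # merge list for every fileset after the first
    first=""
    for prefix in "$@"; do
        echo "plink --bfile $prefix --exclude $missnp --make-bed --out ${prefix}_tmp"
        if [ -z "$first" ]; then
            first="${prefix}_tmp"
        else
            echo "${prefix}_tmp" >> filtered_list.txt
        fi
    done
    echo "plink --bfile $first --merge-list filtered_list.txt --make-bed --out merged_clean"
}

print_exclude_cmds merge_attempt.missnp set1 set2 set3
```

Once the printed commands look right, they can be piped to `sh` or pasted into a terminal. If the second merge still fails, the remaining mismatches may be strand issues rather than true triallelic sites.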
