Remove SNPs with missing names
6.6 years ago
bha ▴ 80

I pruned the 1000G data with MAF and some LD filtering. I noticed that some SNPs have "." as their name (the SNP identifier is a dot). Any suggestions on how I should remove or pull out those SNPs?

genetics plink • 1.8k views

Use bcftools. There are two ways to select the records whose ID is "." (copied from the bcftools manual):

 "." to test missing values

Example:

 bcftools view -i 'ID=="."' test.vcf

    -n, --novel
        print novel sites only (ID column is ".")

Example:

    bcftools view -n test.vcf

How can I remove these IDs (".") from the datasets?


Are these datasets in VCF format? If not, please post an example dataset or a few records here.


Yes, these are in VCF format.


Like cpad0112 said, use bcftools. bcftools view can be used to subset data when the output is redirected to a file.


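Replace test.vcf with your own dataset (kg.vcf here). Note that -n keeps only the records whose ID is "."; to remove them instead, use -k (--known), which prints only the sites whose ID is not ".".

Example code:

 bcftools view -n kg.vcf > missing_ids.vcf
 bcftools view -k kg.vcf > new.vcf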
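
If bcftools is not at hand, a plain-text filter on the ID column gives the same kind of subset. This is only a minimal sketch: it swaps in awk in place of bcftools, assumes an uncompressed, tab-delimited VCF in which ID is the third column, and uses made-up file names (toy.vcf, named.vcf, unnamed.vcf) purely for illustration.

```shell
# Build a tiny tab-delimited VCF for illustration (toy.vcf is a made-up name).
printf '##fileformat=VCFv4.2\n' > toy.vcf
printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n' >> toy.vcf
printf '1\t100\trs111\tA\tG\t.\tPASS\t.\n' >> toy.vcf
printf '1\t200\t.\tC\tT\t.\tPASS\t.\n' >> toy.vcf
printf '1\t300\trs333\tG\tA\t.\tPASS\t.\n' >> toy.vcf

# Remove: keep header lines (starting with "#") and records whose ID (column 3) is not ".".
awk -F'\t' '/^#/ || $3 != "."' toy.vcf > named.vcf

# Pull out: keep header lines and only the records whose ID is ".".
awk -F'\t' '/^#/ || $3 == "."' toy.vcf > unnamed.vcf
```

For compressed or indexed files, bcftools is still the safer choice, since it validates the VCF and writes a proper header.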
