How to extract specific SNPs from a VCF file and create a new one consisting of the extracted loci
2.2 years ago
Shin Taguchi ▴ 40

Hi everyone.

I currently have a VCF file containing SNP information for a total of 3831 loci. From this file, I would like to generate a new VCF file consisting only of the loci whose CHROM and POS columns (first and second columns) match the positions shown below.

scaffold_8754:711 scaffold_56662:2153 scaffold_56764:16891 scaffold_70342:19238 scaffold_70335:4963 scaffold_65968:2460 scaffold_3433:598 scaffold_40074:5854 scaffold_70209:66631 scaffold_71244:898110 scaffold_71250:456730 scaffold_66351:1452 scaffold_70132:55627 ...

Although omitted above, there are actually 1000 loci that I want to extract.

Could I do this using the grep command? Any other ideas or advice would also be appreciated.

Thank you.

vcf grep shellscript SNP • 1.2k views
2.2 years ago
Zhitian Wu ▴ 60

So you want to extract 1000 loci from 3831 loci?

You can save the 1000 loci in a file "loci_1000", one locus per line, then

grep '^#' loci_3831.vcf > loci_1000.vcf
cat loci_1000 |
xargs -I {} grep -wF {} loci_3831.vcf >> loci_1000.vcf
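
One caveat: grep only finds a locus if the VCF line literally contains the "scaffold:position" string (for example in the ID column), and the output follows the order of the loci list rather than the order of the VCF. If your file stores the scaffold and position only in the tab-separated CHROM and POS columns, a minimal awk sketch like the one below may work better. It assumes "loci_1000" holds one scaffold:position per line with no trailing whitespace; `extract_loci` is a hypothetical helper name I made up for illustration, not part of any tool.

```shell
# extract_loci LOCI_FILE VCF_FILE   (hypothetical helper, for illustration only)
# Prints the VCF header lines plus every record whose CHROM:POS is listed
# in LOCI_FILE. Assumes LOCI_FILE has one "scaffold:position" per line.
extract_loci() {
    awk -F'\t' '
        NR == FNR { keep[$1]; next }   # first file: remember each CHROM:POS key
        /^#/      { print; next }      # VCF header lines pass through unchanged
        ($1 ":" $2) in keep            # data lines: print only exact matches
    ' "$1" "$2"
}

# Usage with the file names from this thread:
#   extract_loci loci_1000 loci_3831.vcf > loci_1000.vcf
```

Because the comparison is an exact match on CHROM and POS, scaffold_8754:711 will not accidentally pick up scaffold_8754:7110, and the records come out in the same order as in the original VCF.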

Hope this will help.


So you want to extract 1000 loci from 3831 loci?

Yes.

Thank you for your comment! I will try this script.
