How to extract all the sample IDs from a VCF file for GWAS analyses?
2.2 years ago
Qingyang Xiao ▴ 160

Hi,

How can I extract the IDs of the genotyped samples from a VCF file?

I need to select a subset of samples for GWAS analyses.

Thanks!

VCF GWAS ID sample • 1.6k views

Hi Kevin,

The problem is slightly different. Here I only want the list of IDs, not a VCF file containing the information for a subset of IDs.

Is there a simple way to extract all the IDs? Thanks!


What do you mean by ID? ID is the 3rd column of the VCF.

2.2 years ago

Take a look:

bcftools view -h Merge.UnFiltered.TumorOnly.vcf.gz | tail -1 ;
#CHROM  POS  ID  REF  ALT  QUAL  FILTER  INFO  FORMAT  1174-N  1174-T  1292_CD19-T  1292_noCD19-N  1299-N  1299-T  1347-N  1347-T  1378-N  1378-T  1584-N  1584-T  1696-N  1696-T  1700-N  1700-T  1823-N  1823-T  2111-N  2111-T  2217-N  2217-T  2297_CD19-T  2297_noCD19-N  2327_CD19-T  2327_noCD19-N  2343-N  2343-T  2474-N  2474-T  2514-N  2514-T  2530-N  2530-T  2584-N  2584-T  2596_CD19-T  2596_noCD19-N  2623-N  2623-T  2822-N  2822-T  2896-N  2896-T  2925-N  2925-T  2946_CD19-T  2946_noCD19-N  2983-N  2983-T  2994-N  2994-T  3034_CD19-T  3034_noCD19-N  3053-N  3053-T  3068-N  3068-T  3112-N  3112-T  3146_CD19-T  3146_noCD19-N  3239-N  3239-T  3252-N  3252-T  3396-N  3396-T  3453-N  3453-T  3498-N  3498-T  3505-N  3505-T  3520_CD19-T  3520_noCD19-N  3540-N  3540-T  3583-N  3583-T  3592-N  3592-T  3597-N  3597-T  3764-N  3764-T  3819-N  3819-T  3833-N  3833-T  3926-N  3926-T  3929-N  3929-T  3953-N  3953-T  4000_CD19-T  4000_noCD19-N  4054-N  4054-T  4074-N  4074-T  4090_CD19-T  4090_noCD19-N  4119-N  4119-T  4152-N  4152-T

...or:

bcftools view -h Merge.UnFiltered.TumorOnly.vcf.gz | tail -1 |\
  awk -F "\t" '{for (i=1; i<=NF; i++) {if (i>9) {print $(i)}}}' ;
1174-N
1174-T
1292_CD19-T
1292_noCD19-N
1299-N
1299-T
1347-N
1347-T
1378-N
1378-T
1584-N
1584-T
1696-N
1696-T
1700-N
1700-T
1823-N
1823-T
2111-N
2111-T
...

...or:

bcftools view -h Merge.UnFiltered.TumorOnly.vcf.gz | tail -1 |\
  sed 's/\t/\n/g' | sed '1,9d' ;
1174-N
1174-T
1292_CD19-T
1292_noCD19-N
1299-N
1299-T
1347-N
1347-T
1378-N
1378-T
1584-N
1584-T
1696-N
1696-T
1700-N
1700-T
1823-N
1823-T
2111-N
2111-T
...
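
All three commands above do the same thing: take the last header line (the one starting with #CHROM) and drop its first nine fixed columns. If bcftools is not at hand, the same idea can be sketched with awk alone. The function name vcf_sample_ids is hypothetical, and the sketch assumes a tab-delimited VCF that is either plain text or gzip/bgzip-compressed, and a gzip that passes uncompressed input through when given -cdf:

```shell
# Print one sample ID per line from the header of a VCF file.
# vcf_sample_ids is a hypothetical helper name, not part of bcftools.
vcf_sample_ids() {
  # gzip -cdf decompresses gzip/bgzip input and is assumed to copy
  # plain-text input through unchanged
  gzip -cdf -- "$1" |
    awk -F '\t' '
      # The #CHROM line is the last header line; columns 10 onwards are samples
      /^#CHROM/ { for (i = 10; i <= NF; i++) print $i; exit }
    '
}
```

For example, vcf_sample_ids Merge.UnFiltered.TumorOnly.vcf.gz should print the same list as the commands above.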

Kevin


or

bcftools query -l in.vcf
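
Since the original goal was to select a subset of samples for GWAS, here is a rough sketch of how that list might be used afterwards. The file names (in.vcf, all_samples.txt, subset.txt, subset.vcf.gz), the helper select_ids, and the pattern '-T$' are only placeholders for illustration; the bcftools part runs only if bcftools and the input file are present:

```shell
# Hypothetical input file name; replace with your own.
vcf="in.vcf"

# Keep only the IDs on stdin that match an extended regex
# (select_ids is a hypothetical helper, not a bcftools command).
select_ids() {
  grep -E -- "$1"
}

if command -v bcftools >/dev/null 2>&1 && [ -f "$vcf" ]; then
  # One sample ID per line
  bcftools query -l "$vcf" > all_samples.txt
  # Example filter: IDs ending in -T (adjust to your own naming scheme)
  select_ids '-T$' < all_samples.txt > subset.txt
  # -S/--samples-file keeps only the samples listed in the file
  bcftools view -S subset.txt -Oz -o subset.vcf.gz "$vcf"
fi
```

You can of course also edit subset.txt by hand before the last step.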

Indeed, sir.
