Convert VCF to a specified input format
8 weeks ago
thomm80 • 0

Hello, I need to convert my VCF files to a specific input format for an analysis. The format that I need is as follows:

scaffold pos ind1 ind2 ind3 etc ...
scf 1 T C T Y ...

where T corresponds to a "TT" genotype, C is "CC" and Y is "CT", and so on.

I'm not sure how to do this and would appreciate any assistance.

Thanks!

snp vcf • 220 views
8 weeks ago

In this example I only use sed to collapse T/T to T; you'll have to complete the sed expression with the other genotype combinations.

$ bcftools query -f '[%CHROM:%POS\t%SAMPLE\t%TGT\n]' invcf.gz | sed 's%T/T$%T%' | datamash crosstab 1,2 unique 3

            S1   S2   S3   S4   S5
RF01:970    A/A  A/A  A/A  A/A  C/C
RF02:1726   T    T/G  T/G  T    T
RF02:251    A/A  A/T  A/T  A/A  A/A
RF02:578    G/G  G/G  G/G  A/A  G/G
RF02:877    T/A  T    T    T    T
RF03:1221   C/C  G/G  G/G  C/C  C/C
RF03:1242   C/C  C/C  C/C  A/A  C/C
RF03:1688   T    T    T    T    G/G
RF03:1708   G/G  G/G  G/G  G/G  T
RF03:2150   T/A  T    T    T    T
RF03:2201   G/G  G/C  G/C  G/G  G/G
RF03:2315   G/G  G/C  G/C  G/C  G/G
(...)
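
One possible way to cover all the combinations at once, instead of writing one sed substitution per genotype, is a small awk filter placed where the sed step sits in the pipeline. This is only a sketch: the function name `to_iupac` is made up for illustration, it assumes biallelic-style diploid SNP calls in column 3 (as produced by `%TGT`), and it assumes that anything else (missing calls, indels, haploid calls) should become `N`, which may not be what your downstream analysis expects.

```shell
# Hypothetical helper (not part of the original answer): collapse a diploid
# genotype in column 3 (e.g. "C/T" or "T|C") to a single IUPAC code.
to_iupac() {
  awk 'BEGIN {
         FS = OFS = "\t"
         # Heterozygous pairs, keyed with the two bases in alphabetical order.
         code["AC"] = "M"; code["AG"] = "R"; code["AT"] = "W"
         code["CG"] = "S"; code["CT"] = "Y"; code["GT"] = "K"
       }
       {
         # Split on either the unphased (/) or phased (|) separator.
         n = split($3, a, "[/|]")
         if (n == 2 && a[1] == a[2] && a[1] ~ /^[ACGT]$/) {
           $3 = a[1]                        # homozygous: T/T becomes T
         } else if (n == 2 && a[1] ~ /^[ACGT]$/ && a[2] ~ /^[ACGT]$/) {
           key = (a[1] < a[2]) ? (a[1] a[2]) : (a[2] a[1])
           $3 = code[key]                   # heterozygous: C/T or T/C becomes Y
         } else {
           $3 = "N"                         # assumption: everything else becomes N
         }
         print
       }'
}

# Intended use, replacing the sed step in the command above:
# bcftools query -f '[%CHROM:%POS\t%SAMPLE\t%TGT\n]' invcf.gz | to_iupac | datamash crosstab 1,2 unique 3
```

Sorting the two alleles before the lookup means C/T and T/C both map to Y, so only six heterozygous entries are needed.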