Select one allele at heterozygote position
0
0
Entering edit mode
8.7 years ago
stolarek.ir ▴ 700

Hi,

Having a vcf or plink ped file with called variants I want in positions, where there are 2 possible alleles select only one of them (based on which one had better coverage in the sample) and from this forming a plink format file, where this allele is ina form of diploid homozygote.

Is there any straightforward way for this or I have to code my way out of this?

Kind regards

SNP vcf plink • 1.7k views
ADD COMMENT
0
Entering edit mode

it depends of the VCF INFO and FORMAT available in your input.

ADD REPLY
0
Entering edit mode
CHROM   POS ID  REF ALT QUAL    FILTER  INFO    FORMAT  Sample_028_bwa_LongSeed_sorted_hg19.bam
1   2713327 SNPA2252455 C   T   9.6729  .   .   GT:PL   0/1:37,3,0
1   2865778 SNPA2060958 T   .   67  .   .   GT  0/0
ADD REPLY
0
Entering edit mode

If anyone will be interested I worked this out on ped plink files. First I recode A,C,T,G coding to 0,1,2 with --recode12,

then parse the ped file with this python script:

import random
import itertools

infile = open("subset.ped","r")
outfile = open("new_genotypes.ped","w")

check_11 = ('1','1')
check_22 = ('2','2')
def random_gen():
    choice_m = [check_11,check_22]
    res = random.choice(choice_m)
    return res

def parse_ped(ped_infile):
    check_12 = ('1','2')
    check_21 = ('2','1')

for line in ped_infile:
    row = line.split()
    head = row[0:6]
    genotype_old = row[6:]
    genotype_f = row[6:]
    it = iter(genotype_f)
    genotype_t = zip(it, it)
    for n,i in enumerate(genotype_t):
        if i == check_12 or i == check_21:
            genotype_t[n]=random_gen()
    print genotype_t[0:10]
    genotype_t = list(itertools.chain(*genotype_t))
    print genotype_t[0:10]

    new_line = head+genotype_t
    outfile.write(" ".join(new_line)+"\n")
outfile.close()

parse_ped(infile)
ADD REPLY

Login before adding your answer.

Traffic: 1981 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6