R programming: genotype concordance
9.6 years ago
MAPK ★ 2.1k

Hi guys,

I have an R programming question. I want to compare each genotype (the .GT columns) with the corresponding alleles (the .allele columns) in the data frame df1 and check whether they are concordant. The rule is: if there is only one allele (A, T, G or C on its own), the genotype should be 0/0; if there are two alleles (for example GA), the genotype should be 0/1. That is why GA, which has genotype 0/0, gets a mismatch in the concordance column. Based on this rule, I want to add a new concordance column containing match or mismatch for every allele/GT pair. Each concordance column should be placed (cbind) right after the pair of columns it compares, giving the Result below. Could you please help me get this done? Thank you.

df1

1.allele  1.GT  2.allele  2.GT
A         0/0   A         0/0
GA        0/0   CT        0/1
C         0/0   G         0/0

Result

1.allele  1.GT  1.Concordance  2.allele  2.GT  2.Concordance
A         0/0   match          A         0/0   match
GA        0/0   mismatch       CT        0/1   match
C         0/0   match          G         0/0   match

What is the second "Concordance" column for?


The second Concordance column is for the 2.* columns; since there is no mismatch, it's left blank.


Are there also homozygous SNPs, such as GG or TT?


No, there are not. Thank you.

9.6 years ago
Jimbou ▴ 960

Data:

df <- data.frame(allele1=c("A","AT","C"), GT1=c("0/0","0/0","0/0"),
                 allele2=c("AT","G","CG"), GT2=c("0/0","0/0","0/0"),
                 stringsAsFactors=FALSE)

First, I build the expected genotype for each allele column (one column of expected values per sample). I am assuming that a two-allele call is always written 0/1, never 1/0:

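library(stringr)
# More than one letter in the allele column means "0/1"; a single letter means "0/0"
translation <- function(x) ifelse(str_length(df[,x])>1,"0/1","0/0")
# Columns 1 and 3 are the allele columns; `correct` gets one column per sample
correct <- sapply(c(1,3),translation)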

Finally, mis/match:

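# Compare each GT column (2 and 4) with its expected genotype (column x/2 of `correct`)
compare_gt <- function(x) ifelse(df[,x] == correct[,(x/2)], "match", "mismatch")
comparison <- sapply(c(2,4), compare_gt)
# Place each concordance column right after the pair it describes
cbind(df[,1:2], "Concordance1"=comparison[,1], df[,3:4], "Concordance2"=comparison[,2])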
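
If there are more than two samples, the same idea can be written without hard-coded column indices. The following is only a sketch in base R, using the column naming from the question (1.allele, 1.GT, ...). The helper name add_concordance is made up for illustration, and it assumes every "<n>.allele" column has a matching "<n>.GT" column and that two-allele genotypes are always written 0/1:

```r
# Example data from the question; check.names = FALSE keeps names like "1.allele"
df1 <- data.frame(
  `1.allele` = c("A", "GA", "C"),
  `1.GT`     = c("0/0", "0/0", "0/0"),
  `2.allele` = c("A", "CT", "G"),
  `2.GT`     = c("0/0", "0/1", "0/0"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Hypothetical helper: adds "<n>.Concordance" right after each allele/GT pair
add_concordance <- function(df) {
  allele_cols <- grep("\\.allele$", names(df), value = TRUE)
  pieces <- lapply(allele_cols, function(a) {
    prefix <- sub("\\.allele$", "", a)
    gt_col <- paste0(prefix, ".GT")
    # Rule from the question: one letter -> "0/0", more than one -> "0/1"
    expected <- ifelse(nchar(as.character(df[[a]])) > 1, "0/1", "0/0")
    out <- df[, c(a, gt_col), drop = FALSE]
    out[[paste0(prefix, ".Concordance")]] <-
      ifelse(as.character(df[[gt_col]]) == expected, "match", "mismatch")
    out
  })
  # cbind on data frames keeps the original (non-syntactic) column names
  do.call(cbind, pieces)
}

result <- add_concordance(df1)
print(result)
```

On the data from the question this should give mismatch only for GA in 1.Concordance and match everywhere else. Homozygous two-letter calls such as GG would be treated as 0/1 here, which is fine only because the asker said those do not occur.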

This is awesome, Thank you!
