Hi Guys,
I have a data frame "mydata" with 200 columns and over 15,000 rows.
- I would like to compare two columns, starting from the first column (XR_res1), and check whether their contents match. For example, I want to compare column XR_res3 with column XR_res5 and get a concordance column, result, holding a match or mismatch decision.
- Group all 15,000 rows by "Personal_ID" and report the mismatches from that result column as a percentage per group.
Thanks
Personal_ID XR_res1 XR_res2 XR_res3 XR_res4 XR_res5 XR_res6
001 pos pass pos neg pos neg
001 pos pass neg pass pass pass
001 neg neg neg pos neg pass
002 pass pos pos pass pass pos
002 pos pass pass neg pass pos
003 pass neg neg pos pass pos
003 pos neg pass pass pos pos
003 pass pos pos pass pass neg
003 neg pos neg pass pos neg
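A minimal sketch of the match/mismatch column on the sample rows above, assuming the two columns hold plain character values with no NAs. The column name `result` is only an illustration, and reading the data with `read.table(text = ...)` is just a way to make the example self-contained.

```r
# Sample rows from the post; everything is read as character so the
# leading zeros of Personal_ID are kept.
mydata <- read.table(text = "
Personal_ID XR_res1 XR_res2 XR_res3 XR_res4 XR_res5 XR_res6
001 pos pass pos neg pos neg
001 pos pass neg pass pass pass
001 neg neg neg pos neg pass
002 pass pos pos pass pass pos
002 pos pass pass neg pass pos
003 pass neg neg pos pass pos
003 pos neg pass pass pos pos
003 pass pos pos pass pass neg
003 neg pos neg pass pos neg
", header = TRUE, colClasses = "character")

# Element-wise comparison of two columns, row by row.
# `result` is a hypothetical column name for the concordance decision.
mydata$result <- ifelse(mydata$XR_res3 == mydata$XR_res5, "match", "mismatch")

print(mydata[, c("Personal_ID", "XR_res3", "XR_res5", "result")])
```

The same line works for any other pair of columns by swapping the two column names.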
Hi Sean, thanks for your answer. This solution isn't really working for me; sorry I wasn't clear enough in describing the problem.
As a first step, I simply need to compare two columns (with non-numeric values) for similarity, for instance XR_res3 with XR_res5, and get the mismatch as a percentage. Here I probably need something like all.equal, but I can't figure out the syntax.
As a second step, I need to group the data by Personal_ID and calculate the mismatch per group: compare XR_res3 with XR_res5 for the group with Personal_ID 001, then XR_res3 with XR_res5 for the group with Personal_ID 002, and so on. For that purpose I will build a new data set, something like:
dt.1 <- subset(mydata, select = c("Personal_ID"))
dt.2 <- subset(mydata, select = c("XR_res3"))
dt.3 <- subset(mydata, select = c("XR_res5"))
dt.x <- cbind(dt.1, dt.2, dt.3)
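One possible base R sketch of both steps, assuming a three-column data set shaped like dt.x above and no NAs in the compared columns. Note that all.equal tests whether two whole objects are (nearly) equal, so it is not used here; the element-wise `!=` gives one TRUE/FALSE per row, and the mean of that logical vector is the mismatch fraction. The names `mismatch`, `overall_pct` and `per_id_pct` are made up for the example.

```r
# The three columns of interest, typed in from the sample rows of the post.
dt.x <- data.frame(
  Personal_ID = c("001", "001", "001", "002", "002", "003", "003", "003", "003"),
  XR_res3     = c("pos", "neg", "neg", "pos", "pass", "neg", "pass", "pos", "neg"),
  XR_res5     = c("pos", "pass", "neg", "pass", "pass", "pass", "pos", "pass", "pos"),
  stringsAsFactors = FALSE
)

# Step one: TRUE where the two columns differ, then the overall percentage.
mismatch    <- dt.x$XR_res3 != dt.x$XR_res5
overall_pct <- 100 * mean(mismatch)

# Step two: the same percentage, computed separately for each Personal_ID.
per_id_pct <- tapply(mismatch, dt.x$Personal_ID, function(x) 100 * mean(x))

print(overall_pct)
print(per_id_pct)
```

If the real columns contain NAs, `mean(x, na.rm = TRUE)` would ignore those rows instead of returning NA.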
and my final result should look something like
I suppose I should use data.table's [, , by = Personal_ID], but again I can't figure out the syntax. Thanks
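For the data.table route, a sketch along these lines might work, assuming the data.table package is installed and the compared columns contain no NAs. The output column names `n` and `mismatch_pct` are hypothetical; `.N` is data.table's built-in row count per group.

```r
library(data.table)

# Same three columns as in the sample rows of the post.
dt <- data.table(
  Personal_ID = c("001", "001", "001", "002", "002", "003", "003", "003", "003"),
  XR_res3     = c("pos", "neg", "neg", "pos", "pass", "neg", "pass", "pos", "neg"),
  XR_res5     = c("pos", "pass", "neg", "pass", "pass", "pass", "pos", "pass", "pos")
)

# dt[i, j, by]: i is left empty (all rows), j computes the summaries,
# and by = Personal_ID repeats j once per group.
res <- dt[, .(n = .N,
              mismatch_pct = 100 * mean(XR_res3 != XR_res5)),
          by = Personal_ID]

print(res)
```

An existing data frame such as mydata can be converted first with `as.data.table(mydata)`, after which the same `[, j, by]` call applies.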