Hi,
I want to compare the elements in column V2 of two tables (below): I'm interested in rows where the elements of columns are close to each other by 3 - 6. The two tables may have different numbers of rows.
For example, row 1 in both tables 1 and 2 differ by 3, I want to add the two elements together and output the mean along with other elements wherever the criterion is satisfied (difference of between 3 - 6) occurs as in the example output.
Table1:
V1 V2 V3 V4
TX14566 200 27 10
TX14566 187 27 10
TX14566 723 27 10
TX14566 17 27 10
Table2:
V1 V2 V3 V4
TX14566 203 27 10
TX14566 200 27 10
TX14566 729 27 10
TX14566 39 27 10
Example output (only rows 1 and 3 satisfying the criterion are outputted):
V1 V2 V3 V4
TX14566 202 27 10
TX14566 726 27 10
I have tried using merge function, thus:
xx <- merge(Tab1,Tab2,by=intersect(Tab1$V2,Tab1$V2+3:4)
which did not work. I'm unsure about the correct way to write the code in R and would be grateful for help. Thanks
Please make it more clear. If two tables can have different number of rows than what should be used as an primary identifier to know which two exact rows should be compared. Here all the rows have the same primary identifier i.e. TX14566. Also, what if the values are not different in the first column of a row but some other column. Do you still want to take the mean or ignore them. What if values differ by more than 6 units in one column and less than 3 in other columns. You will have to come up with clear rules. Take some time and create a good example.