New to R.
Problem: I have created objects a and b; a contains 1,200,000 rows and b contains 5000 rows. I want to repeatedly sample 1000 rows from a and compare the sampled items to the items in b to find out how many overlapping items exist.
All I know is how to create an object:
a <- c(1,2,3,....L)
b <- c(1,2,3,....L)
...and randomly sample a:
c <- sample(a, 1000)
However, I don't know how to compare a and b to determine the number of overlapping items. My attempt at this with a simple if/else statement produced warnings instead of a usable result:
if (c==b) "TRUE" else "FALSE"
[1] "FALSE"
Warning messages:
1: In c == b :
longer object length is not a multiple of shorter object length
2: In if (c == b) "TRUE" else "FALSE" :
the condition has length > 1 and only the first element will be used
First, I need to know how to identify overlapping items between objects. Second, I need to repeat this many times (~10,000), perhaps through a for() loop?
Any insight will be much appreciated.
Are a and b just numbers or are they genomic positions (e.g. from a GRanges object)? The former case is quite simple:

The latter case will depend on the object, though many objects allow intersects and comparisons. BTW, avoid for loops in R, they're really slow.
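
For the plain-numbers case, something along these lines might work. It is only a sketch: the stand-in data for a and b, the names s, n_overlap, shared, n_rep and overlaps, and the reduced repetition count are all assumptions made for illustration. The warnings in the question come from == comparing the two vectors element by element (recycling the shorter one), whereas %in% asks, for each sampled item, whether it occurs anywhere in b.

```r
set.seed(1)  # only so this illustration is reproducible

# Stand-in data (assumed); substitute the real a and b.
a <- 1:1200000
b <- sample(a, 5000)

# One draw. The name s is used instead of c to avoid shadowing c().
s <- sample(a, 1000)
n_overlap <- sum(s %in% b)   # number of sampled items that also occur in b
shared    <- intersect(s, b) # the overlapping values themselves (unique)

# Repeat the draw without writing an explicit for loop.
# n_rep is kept small here; the question asks for about 10,000.
n_rep <- 1000
overlaps <- replicate(n_rep, sum(sample(a, 1000) %in% b))

summary(overlaps)  # distribution of overlap counts across the draws
```

Note that sum(s %in% b) counts every matching element of s, while intersect() returns unique values, so the two can differ if a contains duplicates. If a and b are GRanges objects instead, the GenomicRanges package offers countOverlaps() and subsetByOverlaps() for the comparison step.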