Hello
I have a GRanges object consisting of single-nucleotide positions. Each range has an associated variant ID, stored in the metadata column returned by elementMetadata(all.vcf), as shown:
library(GenomicRanges)
all.vcf
GRanges with 6378 ranges and 1 metadata column:
seqnames ranges strand | ID_names
<Rle> <IRanges> <Rle> | <character>
[1] CHR01 [ 16725, 16725] * | CHR01.16725
[2] CHR01 [ 46270, 46270] * | CHR01.46270
[3] CHR01 [ 46282, 46282] * | CHR01.46282
[4] CHR01 [ 64420, 64420] * | CHR01.64420
[5] CHR01 [109016, 109016] * | CHR01.109016
... ... ... ... ... ...
[6374] CHR05 [2939237, 2939237] * | CHR05.2939237
[6375] CHR05 [2965552, 2965552] * | CHR05.2965552
[6376] CHR05 [2981136, 2981136] * | CHR05.2981136
[6377] CHR05 [3096084, 3096084] * | CHR05.3096084
[6378] CHR05 [3154633, 3154633] * | CHR05.3154633
---
seqlengths:
CHR01 CHR02 CHR03 CHR04 CHR05
1325633 1428053 1944125 2519115 3319307
I am trying to subset this GRanges object by the variant IDs stored in that metadata column, using a character vector called filt.ids which contains only the variant IDs I'm interested in. I have tried:
all.vcf[which(elementMetadata(all.vcf)[,1] == filt.ids)]
But this fails because there are far fewer variants in filt.ids than in the GRanges object (R reports: longer object length is not a multiple of shorter object length).
Any ideas?
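One likely fix is to test membership with %in% rather than ==. The == operator compares the two vectors element by element and recycles the shorter one, which is where the length message comes from, whereas %in% returns one TRUE/FALSE per range regardless of how long filt.ids is. A minimal sketch follows; the toy all.vcf and filt.ids below are made-up stand-ins for the real objects, and the names keep and filt.vcf are only illustrative.

```r
library(GenomicRanges)

# Toy stand-in for all.vcf (a few positions only, not the real data)
all.vcf <- GRanges(
  seqnames = c("CHR01", "CHR01", "CHR05"),
  ranges   = IRanges(start = c(16725, 46270, 2939237), width = 1),
  ID_names = c("CHR01.16725", "CHR01.46270", "CHR05.2939237")
)

# Hypothetical vector of the variant IDs to keep
filt.ids <- c("CHR01.46270", "CHR05.2939237")

# %in% checks each ID for membership in filt.ids,
# so the two vectors do not need to have the same length
keep <- mcols(all.vcf)$ID_names %in% filt.ids

# Logical subsetting keeps the ranges in their original order
filt.vcf <- all.vcf[keep]
filt.vcf
```

On the real object the same idea should reduce to all.vcf[elementMetadata(all.vcf)[,1] %in% filt.ids], assuming the IDs are in the first metadata column as in the post.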
Thanks, that worked perfectly!