Question

Subsetting GRanges based on its metadata

3

11.0 years ago

stefano.iantorno ▴ 70

Hello

I have a GRanges object consisting of a list of single nucleotide positions. Each row has an associated variant ID which is stored in a data frame in elementMetadata(all.vcf) as such:

library(GenomicRanges)

all.vcf
GRanges with 6378 ranges and 1 metadata column:
         seqnames             ranges strand   |      ID_names
            <Rle>          <IRanges>  <Rle>   |   <character>
     [1]    CHR01   [ 16725,  16725]      *   |   CHR01.16725
     [2]    CHR01   [ 46270,  46270]      *   |   CHR01.46270
     [3]    CHR01   [ 46282,  46282]      *   |   CHR01.46282
     [4]    CHR01   [ 64420,  64420]      *   |   CHR01.64420
     [5]    CHR01   [109016, 109016]      *   |  CHR01.109016
     ...      ...                ...    ... ...           ...
  [6374]    CHR05 [2939237, 2939237]      *   | CHR05.2939237
  [6375]    CHR05 [2965552, 2965552]      *   | CHR05.2965552
  [6376]    CHR05 [2981136, 2981136]      *   | CHR05.2981136
  [6377]    CHR05 [3096084, 3096084]      *   | CHR05.3096084
  [6378]    CHR05 [3154633, 3154633]      *   | CHR05.3154633
  ---
  seqlengths:
     CHR01   CHR02   CHR03   CHR04   CHR05
   1325633 1428053 1944125 2519115 3319307

I am trying to subset this GRanges object based on the variant IDs stored in the elementMetadata data frame, using a character vector called filt.ids that contains only the variant IDs I'm interested in. I have tried:

all.vcf[which(elementMetadata(all.vcf)[,1] == filt.ids)]

But this returns an error because there are far fewer variants in filt.ids than in the GRanges object (longer object length is not a multiple of shorter object length).

Any ideas?

genome R • 15k views

updated 3.6 years ago by Ram 45k • written 11.0 years ago by stefano.iantorno ▴ 70

3

8.1 years ago

msameet ▴ 50

Another handy trick is to refer to the metadata column by name rather than by position, e.g. for the ID_names column shown in the question:

all.vcf[(elementMetadata(all.vcf)[, "ID_names"] %in% filt.ids)]
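For anyone who wants to try the by-name approach on something small, here is a rough sketch. It assumes GenomicRanges is installed, and the toy object gr and its three positions are made up for illustration (copied from the printout in the question), not the real all.vcf. It uses mcols(), the accessor the GenomicRanges documentation recommends over elementMetadata(), and the `$` shortcut for the same column.

```r
library(GenomicRanges)

# Toy GRanges with the same layout as all.vcf (illustrative values only).
gr <- GRanges(
  seqnames = c("CHR01", "CHR01", "CHR05"),
  ranges   = IRanges(start = c(16725, 46270, 2939237), width = 1),
  ID_names = c("CHR01.16725", "CHR01.46270", "CHR05.2939237")
)

# Hypothetical filter vector holding the IDs to keep.
filt.ids <- c("CHR01.46270", "CHR05.2939237")

# Two equivalent ways to reach the metadata column by name.
by_mcols  <- gr[mcols(gr)$ID_names %in% filt.ids]
by_dollar <- gr[gr$ID_names %in% filt.ids]

by_dollar
```

Referring to the column by name keeps the subsetting correct even if more metadata columns are added later and the ID column is no longer the first one.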

updated 3.6 years ago by Ram 45k • written 8.1 years ago by msameet ▴ 50

Ram · Accepted Answer · 2014-07-31

11

11.0 years ago

Michael 56k

all.vcf[(elementMetadata(all.vcf)[,1] %in% filt.ids)]
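The only change is %in% in place of ==. %in% tests each ID for membership in filt.ids and returns one TRUE/FALSE per range, whereas == compares the two vectors element by element and recycles the shorter one.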
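To see the difference between the two operators without needing a GRanges object, here is a minimal base R sketch. The vector ids is a hypothetical stand-in for the metadata column and filt.ids holds made-up values; only the behaviour of == and %in% is the point.

```r
# Stand-in for elementMetadata(all.vcf)[, 1] (illustrative values only).
ids <- c("CHR01.16725", "CHR01.46270", "CHR01.46282",
         "CHR01.64420", "CHR01.109016")

# Hypothetical filter vector, shorter than ids and in a different order.
filt.ids <- c("CHR01.46282", "CHR01.16725")

# == compares position by position and recycles filt.ids to length 5.
# R complains about the mismatched lengths (silenced here), and the
# comparison misses CHR01.16725 because it is never lined up with itself.
eq <- suppressWarnings(ids == filt.ids)

# %in% asks, for each element of ids, whether it occurs anywhere in filt.ids.
keep <- ids %in% filt.ids

ids[eq]    # only the ID that happened to line up
ids[keep]  # both wanted IDs, in the order they appear in ids
```

The logical vector from %in% always has the same length as ids, which is why it can be used directly as an index into the GRanges object.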

updated 3.6 years ago by Ram 45k • written 11.0 years ago by Michael 56k

0

Thanks, that worked perfectly!

11.0 years ago by stefano.iantorno ▴ 70