7.7 years ago
Nicolas Rosewick
Hi,
I have two GRanges objects:
> gene.pos.gr
GRanges object with 63677 ranges and 1 metadata column:
seqnames ranges strand | ensembl_gene_id
<Rle> <IRanges> <Rle> | <character>
[1] HG991_PATCH [66119285, 66465398] + | ENSG00000261657
[2] 13 [23551994, 23552136] - | ENSG00000223116
[3] 13 [23708313, 23708703] + | ENSG00000233440
[4] 13 [23726725, 23726825] - | ENSG00000207157
[5] 13 [23743974, 23744736] - | ENSG00000229483
... ... ... ... . ...
[63673] HSCHR17_1_CTG1 [ 62293, 253878] - | ENSG00000262334
[63674] HSCHR17_1_CTG1 [189016, 190077] + | ENSG00000262737
[63675] HSCHR17_1_CTG1 [198829, 201112] + | ENSG00000263267
[63676] HSCHR17_1_CTG1 [220262, 225205] + | ENSG00000262336
[63677] HSCHR17_1_CTG1 [222361, 224506] + | ENSG00000262005
-------
seqinfo: 265 sequences from an unspecified genome; no seqlengths
and
> exon.pos.gr
GRanges object with 738009 ranges and 1 metadata column:
seqnames ranges strand | ensembl_gene_id
<Rle> <IRanges> <Rle> | <character>
[1] HG991_PATCH [66119285, 66119659] + | ENSG00000261657
[2] HG991_PATCH [66298434, 66298819] + | ENSG00000261657
[3] HG991_PATCH [66314236, 66314392] + | ENSG00000261657
[4] HG991_PATCH [66320895, 66321004] + | ENSG00000261657
[5] HG991_PATCH [66339743, 66339847] + | ENSG00000261657
... ... ... ... . ...
[738005] HSCHR17_1_CTG1 [200680, 201068] + | ENSG00000263267
[738006] HSCHR17_1_CTG1 [220262, 220486] + | ENSG00000262336
[738007] HSCHR17_1_CTG1 [224683, 225205] + | ENSG00000262336
[738008] HSCHR17_1_CTG1 [222361, 223093] + | ENSG00000262005
[738009] HSCHR17_1_CTG1 [224327, 224506] + | ENSG00000262005
-------
seqinfo: 265 sequences from an unspecified genome; no seqlengths
I want to compute the difference between them, so I used difference <- setdiff(gene.pos.gr, exon.pos.gr)
It gives me:
> difference
GRanges object with 290454 ranges and 0 metadata columns:
seqnames ranges strand
<Rle> <IRanges> <Rle>
[1] HG991_PATCH [66119660, 66298433] +
[2] HG991_PATCH [66298820, 66314235] +
[3] HG991_PATCH [66314393, 66320894] +
[4] HG991_PATCH [66321005, 66339286] +
[5] HG991_PATCH [66339343, 66339742] +
... ... ... ...
[290450] HSCHR2_2_CTG12 [149868059, 149870556] +
[290451] HSCHR2_2_CTG12 [149870662, 149872728] +
[290452] HSCHR2_2_CTG12 [149872946, 149874163] +
[290453] HSCHR2_2_CTG12 [149874278, 149882769] +
[290454] HSCHR2_2_CTG12 [149882977, 149885671] +
-------
seqinfo: 265 sequences from an unspecified genome; no seqlengths
How can I keep the metadata columns from the gene.pos.gr GRanges in the result, especially the ensembl_gene_id column?
Thanks
I am not sure you can do that directly with any of the GRanges set functions (intersect, union, setdiff, etc.), since they return ranges without metadata columns. An alternative is to run findOverlaps between your gene.pos.gr and difference, then use the resulting hits to look up the required metadata and add it to difference. For example, see https://stat.ethz.ch/pipermail/bioconductor/2013-August/054486.html
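A minimal sketch of that findOverlaps approach, assuming the GenomicRanges package is installed. The two small objects below are made-up toy data (hypothetical coordinates and gene IDs "geneA"/"geneB"), not your real ranges, and `hits` and `annotated` are names introduced here for illustration only:

```r
library(GenomicRanges)

# Toy stand-ins for gene.pos.gr and exon.pos.gr (hypothetical values).
gene.pos.gr <- GRanges(
  seqnames = c("13", "13"),
  ranges = IRanges(start = c(100, 500), end = c(300, 900)),
  strand = c("+", "-"),
  ensembl_gene_id = c("geneA", "geneB")
)
exon.pos.gr <- GRanges(
  seqnames = rep("13", 4),
  ranges = IRanges(start = c(100, 250, 500, 800),
                   end   = c(150, 300, 600, 900)),
  strand = c("+", "+", "-", "-"),
  ensembl_gene_id = c("geneA", "geneA", "geneB", "geneB")
)

# Gene ranges minus exon ranges; the metadata columns are dropped here.
difference <- setdiff(gene.pos.gr, exon.pos.gr)

# Map each remaining range back to the gene(s) it falls in.
# findOverlaps() is strand-aware by default for GRanges.
hits <- findOverlaps(difference, gene.pos.gr)

# One row per (range, gene) pair, so a range lying inside several
# overlapping genes is repeated once for each of them.
annotated <- difference[queryHits(hits)]
annotated$ensembl_gene_id <- gene.pos.gr$ensembl_gene_id[subjectHits(hits)]

annotated
```

Indexing by queryHits(hits) rather than assigning straight into difference avoids a length mismatch when a range overlaps more than one gene. If you need exactly one ID per range, you would have to decide how to collapse those duplicates.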