Hi all,
I found overlaps between a couple of genes and TCGA copy number calls, but I've been unsuccessful in appending metadata (hgnc_symbol) to the subject hits.
The data look like this. 1. Sample of genes of interest:
> my.cordinates.hg19.gr
GRanges object with 3 ranges and 3 metadata columns:
seqnames ranges strand | hgnc_symbol band gene_biotype
<Rle> <IRanges> <Rle> | <character> <character> <character>
[1] 8 128747680-128753674 * | MYC q24.21 protein_coding
[2] 5 1253262-1295184 * | TERT p15.33 protein_coding
[3] 17 7565097-7590856 * | TP53 p13.1 protein_coding
-------
seqinfo: 3 sequences from an unspecified genome; no seqlengths
2. TCGA absolute copy number calls as a GRanges object:
> Granges.merged.df.tcga.mastercalls.abs.BRCA.ID.ethnicities
GRanges object with 159857 ranges and 21 metadata columns:
seqnames ranges strand | Sample Num_Probes Length Modal_HSCN_1 Modal_HSCN_2 Modal_Total_CN Subclonal_HSCN_a1 Subclonal_HSCN_a2 Cancer_cell_frac_a1 Ccf_ci95_low_a1
<Rle> <IRanges> <Rle> | <character> <numeric> <numeric> <numeric> <numeric> <numeric> <numeric> <numeric> <numeric> <numeric>
[1] 1 564621-1510801 * | TCGA-3C-AAAU_WHITE 53 946180 1 2 3 0 1 0.25 0.08383
[2] 1 1688192-16142960 * | TCGA-3C-AAAU_WHITE 4180 14454768 1 2 3 0 1 0.06 0.02732
[3] 1 16165661-63472103 * | TCGA-3C-AAAU_WHITE 13591 47306442 1 2 3 0 1 0.04 0.00615
[4] 1 63472868-72759524 * | TCGA-3C-AAAU_WHITE 3196 9286656 1 2 3 0 0 0.03 0.00000
[5] 1 72811904-85632596 * | TCGA-3C-AAAU_WHITE 4069 12820692 1 2 3 0 0 0.04 0.00292
... ... ... ... . ... ... ... ... ... ... ... ... ... ...
[159853] 21 28284150-39907857 * | TCGA-Z7-A8R6_WHITE 4229 11623707 1 1 2 0 0 0.00 0.00000
[159854] 21 39908195-48084820 * | TCGA-Z7-A8R6_WHITE 3214 8176625 1 1 2 0 0 0.05 0.01384
[159855] 22 16055207-24329711 * | TCGA-Z7-A8R6_WHITE 1878 8274504 1 1 2 0 0 0.05 0.00946
[159856] 22 24402321-44489022 * | TCGA-Z7-A8R6_WHITE 6479 20086701 1 1 2 0 0 0.04 0.00644
[159857] 22 44493823-51199978 * | TCGA-Z7-A8R6_WHITE 2807 6706155 1 1 2 0 0 0.06 0.02714
Ccf_ci95_high_a1 Cancer_cell_frac_a2 Ccf_ci95_low_a2 Ccf_ci95_high_a2 LOH Homozygous_deletion solution Type race ethnicity
<numeric> <numeric> <numeric> <numeric> <numeric> <numeric> <character> <character> <character> <character>
[1] 0.39979 0.74 0.58132 0.89771 0 0 new BRCA WHITE NOT HISPANIC OR LATINO
[2] 0.08989 0.85 0.81351 0.87773 0 0 new BRCA WHITE NOT HISPANIC OR LATINO
[3] 0.06643 0.91 0.87122 0.92977 0 0 new BRCA WHITE NOT HISPANIC OR LATINO
[4] 0.05842 0.03 0.00000 0.05981 0 0 new BRCA WHITE NOT HISPANIC OR LATINO
[5] 0.06738 0.06 0.02157 0.08639 0 0 new BRCA WHITE NOT HISPANIC OR LATINO
... ... ... ... ... ... ... ... ... ... ...
[159853] 0.02834 0.00 0.00000 0.02834 0 0 new BRCA WHITE NOT HISPANIC OR LATINO
[159854] 0.06983 0.01 0.00000 0.03265 0 0 new BRCA WHITE NOT HISPANIC OR LATINO
[159855] 0.07308 0.05 0.00946 0.07308 0 0 new BRCA WHITE NOT HISPANIC OR LATINO
[159856] 0.05948 0.04 0.00644 0.05948 0 0 new BRCA WHITE NOT HISPANIC OR LATINO
[159857] 0.08524 0.06 0.02714 0.08524 0 0 new BRCA WHITE NOT HISPANIC OR LATINO
-------
seqinfo: 23 sequences from an unspecified genome; no seqlengths
>
Find overlaps between the TCGA absolute copy number calls and the ancestry-related genes:
overlaps.tcga.ancestry.genes <- findOverlaps(query=my.cordinates.hg19.gr, subject=Granges.merged.df.tcga.mastercalls.abs.BRCA.ID.ethnicities, type = "within") ##complete overlaps
Create an empty column to hold the gene names, for easy filtering in downstream analysis:
Granges.merged.df.tcga.mastercalls.abs.BRCA.ID.ethnicities$Ancestry_genes <- NA
> str(Granges.merged.df.tcga.mastercalls.abs.BRCA.ID.ethnicities$Ancestry_genes) ##sanity checks
logi [1:159857] NA NA NA NA NA NA ...
Extract the subject rows at the subject-hit indexes:
sub_hits.overlaps <- as.data.frame(Granges.merged.df.tcga.mastercalls.abs.BRCA.ID.ethnicities[subjectHits(overlaps.tcga.ancestry.genes)])
str(sub_hits.overlaps$Ancestry_genes) ##sanity checks
Append gene names to the corresponding overlap indexes in the subject hits:
Granges.merged.df.tcga.mastercalls.abs.BRCA.ID.ethnicities[queryHits(overlaps.tcga.ancestry.genes)]$Ancestry_genes <- sub_hits.overlaps[,grepl("hgnc_symbol",names(sub_hits.overlaps))]
Error in do.call(`[<-`, args) : replacement has length zero
Thanks for your help
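One likely cause of the error: sub_hits.overlaps is built from the subject (the copy number segments), which has no hgnc_symbol column, so the grepl() selection returns nothing and the replacement has length zero. The assignment also indexes the subject by queryHits() rather than subjectHits(). Below is a minimal sketch of one way to do the annotation, assuming the gene names should come from the query and be written to the subject rows that were hit. The objects genes and segments are small hypothetical stand-ins for the two GRanges in the post; only the three gene coordinates are taken from the question, the segment coordinates and sample names are made up.

```r
library(GenomicRanges)

# Hypothetical stand-in for my.cordinates.hg19.gr (gene coordinates from the post).
genes <- GRanges(
  seqnames = c("8", "5", "17"),
  ranges = IRanges(start = c(128747680, 1253262, 7565097),
                   end   = c(128753674, 1295184, 7590856)),
  hgnc_symbol = c("MYC", "TERT", "TP53")
)

# Hypothetical stand-in for the TCGA segments (made-up coordinates and samples).
segments <- GRanges(
  seqnames = c("8", "5", "17", "1"),
  ranges = IRanges(start = c(100000000, 1000000, 7000000, 564621),
                   end   = c(140000000, 2000000, 8000000, 1510801)),
  Sample = c("S1", "S1", "S2", "S2")
)

# Genes (query) that lie completely within a segment (subject).
hits <- findOverlaps(query = genes, subject = segments, type = "within")

# Start from a character NA column, then fill only the rows that were hit.
# Subject rows are addressed with subjectHits(); the gene names are taken
# from the query and addressed with queryHits().
segments$Ancestry_genes <- NA_character_
segments$Ancestry_genes[subjectHits(hits)] <- genes$hgnc_symbol[queryHits(hits)]

segments
```

With this toy data, the first three segments get MYC, TERT and TP53 and the chromosome 1 segment stays NA. Note that if a single segment contains more than one gene of interest, this assignment keeps only one of them; in that case the names would need to be collapsed per subject row first.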
Thanks! this was helpful!