I am trying to follow the karyoploteR tutorial to plot genomic events (I have gene IDs and frequencies) on an hg38 track. I am following the gene expression example (https://bernatgel.github.io/karyoploter_tutorial//Examples/GeneExpression/GeneExpression.html), but I am stuck at the stage where I join my data to the GRanges object as metadata columns. The hg38 TxDb has no gene names, only a gene_id column (I can't tell which kind of ID it is), whereas TxDb.Dmelanogaster.UCSC.dm6.ensGene, used in the tutorial, has gene names in it. I am not familiar with GRanges, so I can't figure out how to add my data to the hg38 GRanges object and proceed with the plotting. The hg38 TxDb does have transcript IDs.
library(TxDb.Hsapiens.UCSC.hg38.knownGene)
txdb <- TxDb.Hsapiens.UCSC.hg38.knownGene
hm.genes <- genes(txdb)
hm.genes
GRanges object with 24183 ranges and 1 metadata column:
seqnames ranges strand | gene_id
<Rle> <IRanges> <Rle> | <character>
1 chr19 [58345178, 58362751] - | 1
10 chr8 [18391245, 18401218] + | 10
100 chr20 [44619522, 44651742] - | 100
1000 chr18 [27950966, 28177446] - | 1000
100009613 chr11 [70072434, 70075433] - | 100009613
... ... ... ... . ...
9991 chr9 [112217716, 112333667] - | 9991
9992 chr21 [ 34364024, 34371389] + | 9992
9993 chr22 [ 19036282, 19122454] - | 9993
9994 chr6 [ 89829894, 89874436] + | 9994
9997 chr22 [ 50523568, 50526439] - | 9997
txgr <- transcripts(txdb)
txgr
GRanges object with 82960 ranges and 2 metadata columns:
seqnames ranges strand | tx_id tx_name
<Rle> <IRanges> <Rle> | <integer> <character>
[1] chr1 [ 11874, 14409] + | 1 uc001aaa.3
[2] chr1 [ 11874, 14409] + | 2 uc010nxq.1
[3] chr1 [ 11874, 14409] + | 3 uc010nxr.1
[4] chr1 [ 69091, 70008] + | 4 uc001aal.1
[5] chr1 [321084, 321115] + | 5 uc001aaq.2
... ... ... ... . ... ...
[82956] chrUn_gl000237 [ 1, 2686] - | 82956 uc011mgu.1
[82957] chrUn_gl000241 [20433, 36875] - | 82957 uc011mgv.2
[82958] chrUn_gl000243 [11501, 11530] + | 82958 uc011mgw.1
[82959] chrUn_gl000243 [13608, 13637] + | 82959 uc022brq.1
[82960] chrUn_gl000247 [ 5787, 5816] - | 82960 uc022brr.1
My data:
head(data)
Gene Freq ID
1 AATF 1 uc002hni.3
2 ACAD10 1 uc001tsq.4
3 ACADSB 1 uc001lhb.4
4 AGO3 1 uc001bzn.1
5 AJUBA 1 uc001why.5
6 ALOX15B 1 uc002gju.4
My goal is to plot the frequency of the genes on the chromosome. Could you please help with tweaking the code for hg38 to add my data to the GRanges object?
Thank you! Neetha
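One possible way to do the join, sketched under assumptions: the ID column in the data looks like UCSC knownGene transcript names, so it can be matched against the tx_name column of transcripts(txdb) rather than against gene_id (which, in the UCSC knownGene TxDb packages, holds Entrez Gene IDs). To keep the sketch runnable without downloading the annotation, txgr below is a small hand-made stand-in with placeholder coordinates; with the real annotation it would be txgr <- transcripts(txdb). The name my.gr is hypothetical.

```r
library(GenomicRanges)

# Toy stand-in for transcripts(txdb). The coordinates are placeholders,
# NOT real annotations; only the tx_name values come from the question.
txgr <- GRanges(
  seqnames = c("chr17", "chr12", "chr10"),
  ranges   = IRanges(start = c(100, 200, 300), width = 50),
  strand   = c("+", "+", "-"),
  tx_name  = c("uc002hni.3", "uc001tsq.4", "uc001lhb.4")
)

# A few rows shaped like the data in the question
data <- data.frame(
  Gene = c("AATF", "ACAD10", "AGO3"),
  Freq = c(1, 1, 1),
  ID   = c("uc002hni.3", "uc001tsq.4", "uc001bzn.1"),
  stringsAsFactors = FALSE
)

# For each row of data, find the position of its ID in the transcript names
idx   <- match(data$ID, txgr$tx_name)
found <- !is.na(idx)   # IDs missing from the annotation give NA

# Keep the matched ranges and attach the data as metadata columns
my.gr <- txgr[idx[found]]
mcols(my.gr)$Gene <- data$Gene[found]
mcols(my.gr)$Freq <- data$Freq[found]

my.gr
```

IDs that are not present in the annotation (here the AGO3 row, because the toy txgr does not contain it) are dropped instead of causing an error, so it is worth checking sum(!found) on real data. The resulting my.gr carries Freq as a metadata column, which should be usable as the y value for the karyoploteR plotting functions in the tutorial.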
Thanks Bastein Herve. It worked!