Entering edit mode
2.7 years ago
cisplatin1987
▴
10
Hello,
I am using R to find out the corresponding gene that covers the position 10736809, chromosome 1:
> library(TxDb.Hsapiens.UCSC.hg38.knownGene)
> library(VariantAnnotation)
> txdb <- TxDb.Hsapiens.UCSC.hg38.knownGene
> rng <- GRanges(seqnames='chr1',ranges=IRanges(10736809,10736809))
However, locateVariants() gives a report like this:
> locateVariants(rng, txdb, AllVariants())
'select()' returned many:1 mapping between keys and columns
GRanges object with 9 ranges and 9 metadata columns:
seqnames ranges strand | LOCATION LOCSTART LOCEND QUERYID TXID CDSID GENEID PRECEDEID
<Rle> <IRanges> <Rle> | <factor> <integer> <integer> <integer> <character> <IntegerList> <character> <CharacterList>
[1] chr1 10736809 + | intron 60904 60904 1 10330 <NA>
[2] chr1 10736809 + | intron 63204 63204 1 10331 <NA>
[3] chr1 10736809 + | intron 59598 59598 1 10332 <NA>
[4] chr1 10736809 + | intron 59783 59783 1 10333 <NA>
[5] chr1 10736809 + | intron 59598 59598 1 217346 80781
[6] chr1 10736809 + | intron 60904 60904 1 217347 80781
[7] chr1 10736809 + | intron 63204 63204 1 217348 80781
[8] chr1 10736809 + | intron 59783 59783 1 217350 80781
[9] chr1 10736809 + | intron 208422 208422 1 217350 80781
The Gene ID 80781 refers to a gene in chr21, so totally wrong. The target position actually belongs to the intron of gene CASZ1 (Gene ID 54897). However, txdb seems to have their correct information:
> select(txdb, keys = c("54897", "80781"), columns="TXCHROM", keytype="GENEID")
'select()' returned 1:1 mapping between keys and columns
GENEID TXCHROM
1 54897 chr1
2 80781 chr21
What could be wrong? The basic information of my system:
> sessionInfo()
R version 4.1.2 (2021-11-01)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19043)
Matrix products: default
locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252 LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C LC_TIME=English_United States.1252
attached base packages:
[1] stats4 parallel stats graphics grDevices utils datasets methods base
other attached packages:
[1] VariantAnnotation_1.38.0 Rsamtools_2.8.0 Biostrings_2.60.2
[4] XVector_0.32.0 SummarizedExperiment_1.22.0 MatrixGenerics_1.4.3
[7] matrixStats_0.61.0 TxDb.Hsapiens.UCSC.hg38.knownGene_3.13.0 GenomicFeatures_1.44.2
[10] AnnotationDbi_1.54.1 Biobase_2.52.0 GenomicRanges_1.44.0
[13] GenomeInfoDb_1.28.4 IRanges_2.26.0 S4Vectors_0.30.2
[16] BiocGenerics_0.38.0
loaded via a namespace (and not attached):
[1] Rcpp_1.0.8.3 lattice_0.20-45 prettyunits_1.1.1 png_0.1-7 assertthat_0.2.1
[6] digest_0.6.29 utf8_1.2.2 BiocFileCache_2.0.0 R6_2.5.1 RSQLite_2.2.11
[11] httr_1.4.2 pillar_1.7.0 zlibbioc_1.38.0 rlang_1.0.2 progress_1.2.2
[16] curl_4.3.2 rstudioapi_0.13 blob_1.2.2 Matrix_1.4-1 BiocParallel_1.26.2
[21] stringr_1.4.0 RCurl_1.98-1.6 bit_4.0.4 biomaRt_2.48.3 DelayedArray_0.18.0
[26] compiler_4.1.2 rtracklayer_1.52.1 pkgconfig_2.0.3 tidyselect_1.1.2 KEGGREST_1.32.0
[31] tibble_3.1.6 GenomeInfoDbData_1.2.6 XML_3.99-0.9 fansi_1.0.3 crayon_1.5.1
[36] dplyr_1.0.8 dbplyr_2.1.1 GenomicAlignments_1.28.0 bitops_1.0-7 rappdirs_0.3.3
[41] grid_4.1.2 lifecycle_1.0.1 DBI_1.1.2 magrittr_2.0.2 cli_3.2.0
[46] stringi_1.7.6 cachem_1.0.6 xml2_1.3.3 ellipsis_0.3.2 filelock_1.0.2
[51] generics_0.1.2 vctrs_0.3.8 rjson_0.2.21 restfulr_0.0.13 tools_4.1.2
[56] bit64_4.0.5 BSgenome_1.60.0 glue_1.6.2 purrr_0.3.4 hms_1.1.1
[61] fastmap_1.1.0 yaml_2.3.5 BiocManager_1.30.16 memoise_2.0.1 BiocIO_1.2.0
Thank you in advance!
I am getting the correct gene, using:
R version 4.0.2, TxDb.Hsapiens.UCSC.hg38.knownGene_3.10.0, VariantAnnotation_1.34.0
I suggest updating your R and Bioconductor packages, and try again.