I'm trying to build a customized protein database using customProDB. Following the tutorial, I ran the splice junction analysis and generated the annotation files (splicemax, ids, etc.). However, the OutputNovelJun function fails with an error. I have copied and pasted the code and messages below.
> library(customProDB)
Loading required package: IRanges
Loading required package: BiocGenerics
Loading required package: parallel
Attaching package: ‘BiocGenerics’ The following objects are masked from ‘package:parallel’: clusterApply, clusterApplyLB, clusterCall, clusterEvalQ, clusterExport, clusterMap, parApply, parCapply, parLapply, parLapplyLB, parRapply, parSapply, parSapplyLB
The following objects are masked from ‘package:stats’: IQR, mad, xtabs
The following objects are masked from ‘package:base’: anyDuplicated, append, as.data.frame, cbind, colnames, do.call, duplicated, eval, evalq, Filter, Find, get, grep, grepl, intersect, is.unsorted, lapply, lengths, Map, mapply, match, mget, order, paste, pmax, pmax.int, pmin, pmin.int, Position, rank, rbind, Reduce, rownames, sapply, setdiff, sort, table, tapply, union, unique, unsplit, which, which.max, which.min
Loading required package: S4Vectors
Loading required package: stats4
Attaching package: ‘S4Vectors’
The following objects are masked from ‘package:base’: colMeans, colSums, expand.grid, rowMeans, rowSums
Loading required package: AnnotationDbi
Loading required package: Biobase
Welcome to Bioconductor
Vignettes contain introductory material; view with 'browseVignettes()'. To cite Bioconductor, see 'citation("Biobase")', and for packages 'citation("pkgname")'.
Loading required package: biomaRt
Gene2RefSeq <- read.csv("C:/Users/Admin/Desktop/CustomizedDB/Gene2RefSeq_Parsed.txt", sep = "\t")
pepfasta <- "C:/Users/Admin/Desktop/CustomizedDB/customProDB/pepfasta.fasta"
CDSfasta <- "C:/Users/Admin/Desktop/CustomizedDB/customProDB/CDSfasta.fasta"
setwd("C:/Users/Admin/Desktop/CustomizedDB/example")
annotation_path <- getwd()
transcript_ids <- as.matrix(Gene2RefSeq[ , 2])
transcript_ids <- transcript_ids[1:1000]
PrepareAnnotationRefseq(genome='hg19', CDSfasta, pepfasta, annotation_path, transcript_ids = transcript_ids, splice_matrix = TRUE)
Build TranscriptDB object (txdb.sqlite) ...
Download the refGene table ... OK
Download the hgFixed.refLink table ... OK
Extract the 'transcripts' data frame ... OK
Extract the 'splicings' data frame ... OK
Download and preprocess the 'chrominfo' data frame ... OK
Prepare the 'metadata' data frame ... OK
Make the TxDb object ... OK
done
Prepare gene/transcript/protein id mapping information (ids.RData) ... done
Prepare exon annotation information (exon_anno.RData) ... done
Prepare protein sequence (proseq.RData) ... done
Prepare protein coding sequence (procodingseq.RData)... done
Prepare exon splice information (splicemax.RData) ... done
There were 16 warnings (use warnings() to see them)
load("splicemax.RData")
load("ids.RData")
library(TxDb.Hsapiens.UCSC.hg19.knownGene)
Loading required package: GenomicFeatures
Loading required package: GenomeInfoDb
Loading required package: GenomicRanges
txdb <- TxDb.Hsapiens.UCSC.hg19.knownGene
bedfile <- "C:/Users/Admin/Desktop/CorrectedSpliceJunctions_Colon.bed"
jun <- Bed2Range(bedfile,skip=1,covfilter=5)
junction_type <- JunctionType(jun, splicemax, txdb, ids)
outf_junc <- paste(getwd(), '/test_junc.fasta',sep='')
library('BSgenome.Hsapiens.UCSC.hg19')
Loading required package: BSgenome
Loading required package: Biostrings
Loading required package: XVector
Loading required package: rtracklayer
proteinseq <- read.csv2("C:/Users/Admin/Desktop/CustomizedDB/proteinseq.txt", sep = "\t")
OutputNovelJun <- OutputNovelJun(junction_type, Hsapiens, outf_junc, proteinseq)
Error in loadFUN(x, seqname, ranges) :
trying to load regions beyond the boundaries of non-circular sequence "chr17"
> sessionInfo()
R version 3.3.3 (2017-03-06)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1
locale:
[1] LC_COLLATE=Korean_Korea.949 LC_CTYPE=Korean_Korea.949 LC_MONETARY=Korean_Korea.949 LC_NUMERIC=C
[5] LC_TIME=Korean_Korea.949
attached base packages:
[1] stats4 parallel stats graphics grDevices utils datasets methods base
other attached packages:
[1] BSgenome.Hsapiens.UCSC.hg19_1.4.0 BSgenome_1.42.0
[3] rtracklayer_1.34.2 Biostrings_2.42.1
[5] XVector_0.14.1 TxDb.Hsapiens.UCSC.hg19.knownGene_3.2.2
[7] GenomicFeatures_1.26.4 GenomicRanges_1.26.4
[9] GenomeInfoDb_1.10.3 customProDB_1.14.1
[11] biomaRt_2.30.0 AnnotationDbi_1.36.2
[13] Biobase_2.34.0 IRanges_2.8.2
[15] S4Vectors_0.12.2 BiocGenerics_0.20.0
loaded via a namespace (and not attached):
[1] Rcpp_0.12.10 magrittr_1.5 zlibbioc_1.20.0 GenomicAlignments_1.10.1
[5] BiocParallel_1.8.2 lattice_0.20-35 plyr_1.8.4 stringr_1.2.0
[9] tools_3.3.3 grid_3.3.3 SummarizedExperiment_1.4.0 DBI_0.6-1
[13] digest_0.6.12 Matrix_1.2-8 bitops_1.0-6 RCurl_1.95-4.8
[17] memoise_1.1.0 RSQLite_1.1-2 stringi_1.1.5 Rsamtools_1.26.2
[21] XML_3.98-1.6 VariantAnnotation_1.20.3
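
Regarding the chr17 error above: one possible cause (an assumption on my part, not something the output confirms) is that some junctions in the BED file end beyond the hg19 length of chr17, for example if the alignment was done against a different genome build. Below is a minimal base-R sketch of how such rows could be flagged. The helper `find_out_of_bounds` and the toy coordinates are hypothetical and only for illustration; the chromosome lengths are the hg19 values, which in a real session could be taken from `seqlengths(Hsapiens)` instead.

```r
# Hypothetical helper: return BED rows whose coordinates fall outside
# the chromosome lengths of the reference genome in use.
find_out_of_bounds <- function(bed, chrom_lengths) {
  # Reference length for each row's chromosome (NA if the name is unknown)
  len <- unname(chrom_lengths[as.character(bed$chrom)])
  bad <- is.na(len) | bed$end > len | bed$start < 0
  bed[bad, , drop = FALSE]
}

# Toy junctions with made-up coordinates (illustration only)
bed <- data.frame(
  chrom = c("chr17", "chr17", "chr1"),
  start = c(41196311, 81500000, 1000),
  end   = c(41197694, 81500900, 2000),
  stringsAsFactors = FALSE
)

# hg19 lengths of chr1 and chr17; with BSgenome.Hsapiens.UCSC.hg19 loaded,
# seqlengths(Hsapiens) would provide the full set.
chrom_lengths <- c(chr1 = 249250621, chr17 = 81195210)

out_of_bounds <- find_out_of_bounds(bed, chrom_lengths)
print(out_of_bounds)
```

If a check like this reports rows for the real BED file, that would suggest the junction coordinates and the hg19 BSgenome object do not refer to the same assembly.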