Making a customized DB using customProDB (R package)
7.5 years ago
d88020 ▴ 20

I'm trying to build a customized database using customProDB. Following the tutorial, I ran the splice junction analysis and generated the annotation files (splicemax, ids, etc.). However, the OutputNovelJun function fails with an error. I have copied and pasted my code and the messages below.

> library(customProDB)

Loading required package: IRanges

Loading required package: BiocGenerics

Loading required package: parallel

Attaching package: ‘BiocGenerics’

The following objects are masked from ‘package:parallel’: clusterApply, clusterApplyLB, clusterCall, clusterEvalQ, clusterExport, clusterMap, parApply, parCapply, parLapply, parLapplyLB, parRapply, parSapply, parSapplyLB

The following objects are masked from ‘package:stats’: IQR, mad, xtabs

The following objects are masked from ‘package:base’: anyDuplicated, append, as.data.frame, cbind, colnames, do.call, duplicated, eval, evalq, Filter, Find, get, grep, grepl, intersect, is.unsorted, lapply, lengths, Map, mapply, match, mget, order, paste, pmax, pmax.int, pmin, pmin.int, Position, rank, rbind, Reduce, rownames, sapply, setdiff, sort, table, tapply, union, unique, unsplit, which, which.max, which.min

Loading required package: S4Vectors

Loading required package: stats4

Attaching package: ‘S4Vectors’

The following objects are masked from ‘package:base’: colMeans, colSums, expand.grid, rowMeans, rowSums

Loading required package: AnnotationDbi

Loading required package: Biobase

Welcome to Bioconductor

Vignettes contain introductory material; view with 'browseVignettes()'. To cite Bioconductor, see 'citation("Biobase")', and for packages 'citation("pkgname")'.

Loading required package: biomaRt

Gene2RefSeq <- read.csv("C:/Users/Admin/Desktop/CustomizedDB/Gene2RefSeq_Parsed.txt", sep = "\t")
pepfasta <- "C:/Users/Admin/Desktop/CustomizedDB/customProDB/pepfasta.fasta"
CDSfasta <- "C:/Users/Admin/Desktop/CustomizedDB/customProDB/CDSfasta.fasta"
setwd("C:/Users/Admin/Desktop/CustomizedDB/example")
annotation_path <- getwd()
transcript_ids <- as.matrix(Gene2RefSeq[ , 2])
transcript_ids <- transcript_ids[1:1000]
PrepareAnnotationRefseq(genome='hg19', CDSfasta, pepfasta, annotation_path, transcript_ids = transcript_ids, splice_matrix = TRUE)

Build TranscriptDB object (txdb.sqlite) ...

Download the refGene table ... OK

Download the hgFixed.refLink table ... OK

Extract the 'transcripts' data frame ... OK

Extract the 'splicings' data frame ... OK

Download and preprocess the 'chrominfo' data frame ... OK

Prepare the 'metadata' data frame ... OK

Make the TxDb object ... OK

done

Prepare gene/transcript/protein id mapping information (ids.RData) ... done

Prepare exon annotation information (exon_anno.RData) ... done

Prepare protein sequence (proseq.RData) ... done

Prepare protein coding sequence (procodingseq.RData)... done

Prepare exon splice information (splicemax.RData) ... done

There were 16 warnings (use warnings() to see them)

load("splicemax.RData")
load("ids.RData")
library(TxDb.Hsapiens.UCSC.hg19.knownGene)

Loading required package: GenomicFeatures

Loading required package: GenomeInfoDb

Loading required package: GenomicRanges

txdb <- TxDb.Hsapiens.UCSC.hg19.knownGene
bedfile <- "C:/Users/Admin/Desktop/CorrectedSpliceJunctions_Colon.bed"
jun <- Bed2Range(bedfile,skip=1,covfilter=5)
junction_type <- JunctionType(jun, splicemax, txdb, ids)
outf_junc <- paste(getwd(), '/test_junc.fasta',sep='')
library('BSgenome.Hsapiens.UCSC.hg19')

Loading required package: BSgenome

Loading required package: Biostrings

Loading required package: XVector

Loading required package: rtracklayer

proteinseq <- read.csv2("C:/Users/Admin/Desktop/CustomizedDB/proteinseq.txt", sep = "\t")
OutputNovelJun <- OutputNovelJun(junction_type, Hsapiens, outf_junc, proteinseq)

Error in loadFUN(x, seqname, ranges) :

trying to load regions beyond the boundaries of non-circular sequence "chr17"

> sessionInfo()
R version 3.3.3 (2017-03-06)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

locale:
[1] LC_COLLATE=Korean_Korea.949  LC_CTYPE=Korean_Korea.949    LC_MONETARY=Korean_Korea.949 LC_NUMERIC=C                
[5] LC_TIME=Korean_Korea.949    

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] BSgenome.Hsapiens.UCSC.hg19_1.4.0       BSgenome_1.42.0                        
 [3] rtracklayer_1.34.2                      Biostrings_2.42.1                      
 [5] XVector_0.14.1                          TxDb.Hsapiens.UCSC.hg19.knownGene_3.2.2
 [7] GenomicFeatures_1.26.4                  GenomicRanges_1.26.4                   
 [9] GenomeInfoDb_1.10.3                     customProDB_1.14.1                     
[11] biomaRt_2.30.0                          AnnotationDbi_1.36.2                   
[13] Biobase_2.34.0                          IRanges_2.8.2                          
[15] S4Vectors_0.12.2                        BiocGenerics_0.20.0                    

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.10               magrittr_1.5               zlibbioc_1.20.0            GenomicAlignments_1.10.1  
 [5] BiocParallel_1.8.2         lattice_0.20-35            plyr_1.8.4                 stringr_1.2.0             
 [9] tools_3.3.3                grid_3.3.3                 SummarizedExperiment_1.4.0 DBI_0.6-1                 
[13] digest_0.6.12              Matrix_1.2-8               bitops_1.0-6               RCurl_1.95-4.8            
[17] memoise_1.1.0              RSQLite_1.1-2              stringi_1.1.5              Rsamtools_1.26.2          
[21] XML_3.98-1.6               VariantAnnotation_1.20.3
7.5 years ago

You should be loading the proteinseq from your annotation directory, i.e.

load("C:/Users/Admin/Desktop/CustomizedDB/example/proseq.RData")

That will load the proteinseq object into your environment. You can then pass that object to OutputNovelJun in place of the table you read from proteinseq.txt. It's one of the trickier functions to get working, so I wouldn't be surprised if you run into more issues with it after this one is fixed.
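
To make the suggestion concrete, here is a minimal sketch in base R. The helper name `load_proteinseq` is hypothetical (it is not part of customProDB), and it assumes that proseq.RData stores its contents under the object name `proteinseq`, as described above. Loading into a separate environment just avoids silently overwriting a `proteinseq` that is already in your workspace, such as the one created with read.csv2.

```r
# Hypothetical helper (not part of customProDB): read the `proteinseq`
# object out of the proseq.RData file written to the annotation directory.
# Assumption: the file stores its contents under the name `proteinseq`.
load_proteinseq <- function(annotation_path) {
  rdata <- file.path(annotation_path, "proseq.RData")
  if (!file.exists(rdata)) {
    stop("proseq.RData not found in ", annotation_path)
  }
  # Load into a private environment so nothing in the workspace is overwritten.
  env <- new.env()
  load(rdata, envir = env)
  if (!exists("proteinseq", envir = env, inherits = FALSE)) {
    stop("proseq.RData does not contain an object named 'proteinseq'")
  }
  get("proteinseq", envir = env, inherits = FALSE)
}

# Usage with the paths and objects from the question. This part needs
# customProDB, BSgenome.Hsapiens.UCSC.hg19, and the junction_type and
# outf_junc objects created earlier in the session, so it is left commented:
# proteinseq <- load_proteinseq("C:/Users/Admin/Desktop/CustomizedDB/example")
# OutputNovelJun(junction_type, Hsapiens, outf_junc, proteinseq)
```

Note that the sketch calls OutputNovelJun without assigning the result back to a variable named OutputNovelJun, which keeps the function name from being shadowed in your workspace. Whether this alone clears the "beyond the boundaries" error for chr17 is not something I can promise.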
