VariantAnnotation parallel usage
10.0 years ago
russhh 5.7k

Hi, I'm trying to run a Bioconductor variant annotation workflow, similar to that described here, but I'm having trouble running my code in parallel over data from multiple patients' samples.

I've narrowed the problem down to the example below: effectively, I can't call locateVariants() within an mclapply() call, presumably because the parallel calls try to use the same database handle (apologies if I've misunderstood the problem, or if it's trivial). Is there a simple way to run this in parallel within R?

## testcode
library('parallel') # provides mclapply()
library('VariantAnnotation')
library('TxDb.Hsapiens.UCSC.hg19.knownGene')
txdb <- TxDb.Hsapiens.UCSC.hg19.knownGene
si <- Seqinfo(
  seqnames = names(genome(txdb)),
  genome = genome(txdb))
pos.pt1 <- GRanges(
  seqnames = c('chr1', 'chr2', 'chr3'),
  ranges = IRanges(start = rep(10^6, 3), end = rep(10^6, 3)),
  seqinfo = si
  )
pos.pt2 <- GRanges(
  seqnames = c('chr11', 'chr12', 'chr13'),
  ranges = IRanges(start = rep(10^6, 3), end = rep(10^6, 3)),
  seqinfo = si
  )
pos.list <- list(pt1 = pos.pt1, pt2 = pos.pt2)

lapply(pos.list, function(gr){
  locateVariants(gr, txdb, AllVariants())
  }) # runs fine, aside from a couple of warnings

mclapply(pos.list, function(gr){
  locateVariants(gr, txdb, AllVariants())
  }) # fails with error:

Warning message:
In mclapply(pos.list, function(gr) { :
  scheduled core 1 encountered error in user code, all values of the job will be affected

and this result (the result for patient 2 was the same as in the lapply version):

$pt1
[1] "Error in sqliteFetch(rs, n = -1) : \n  rsqlite_query_fetch: failed: database disk image is malformed\n"
attr(,"class")
[1] "try-error"
attr(,"condition")
<simpleError in sqliteFetch(rs, n = -1): rsqlite_query_fetch: failed: database disk image is malformed>

All the best

Russ
Liverpool

parallel bioconductor R • 2.4k views

FYI: don't use sqlite3 on NFS. From the SQLite FAQ (http://www.sqlite.org/faq.html):

But use caution: this locking mechanism might not work correctly if the database file is kept on an NFS filesystem. This is because fcntl() file locking is broken on many NFS implementations.
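
If the failure does come from forked workers sharing one SQLite connection, as the question suspects, one thing to try is a cluster of separate R processes rather than mclapply(). The sketch below is not a confirmed fix. It assumes that loading TxDb.Hsapiens.UCSC.hg19.knownGene inside each worker gives that worker its own database connection. The worker count of 2 and the simplified rebuild of pos.list (without the Seqinfo) are choices made only for this example.

```r
library(parallel)
library(VariantAnnotation)

# Rebuild a small version of the question's input: one GRanges per patient.
pos.list <- list(
  pt1 = GRanges(c('chr1', 'chr2', 'chr3'), IRanges(rep(10^6, 3), width = 1)),
  pt2 = GRanges(c('chr11', 'chr12', 'chr13'), IRanges(rep(10^6, 3), width = 1))
)

# Start separate R processes instead of forking the current session.
# Two workers is an arbitrary choice for this example.
cl <- makeCluster(2)

# Load the packages inside each worker, so each one opens the TxDb itself
# (assumption: this avoids sharing the parent session's SQLite handle).
invisible(clusterEvalQ(cl, {
  library(VariantAnnotation)
  library(TxDb.Hsapiens.UCSC.hg19.knownGene)
}))

# Refer to the TxDb by its package-level name so that each worker uses its
# own copy rather than an object shipped over from the parent session.
res <- parLapply(cl, pos.list, function(gr) {
  locateVariants(gr, TxDb.Hsapiens.UCSC.hg19.knownGene, AllVariants())
})

stopCluster(cl)
```

res should have the same shape as the lapply() result, one element per patient. If the R library itself sits on an NFS mount, the locking caveat quoted above may still apply, since every worker reads the same .sqlite file.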
