I am working through the megaptera vignette. The initial parts, including database setup, feeding the database a taxonomic backbone [stepA(x)], and fetching the sequences for the desired loci of the given taxa [stepB(x)], work fine. Most of stepC(x) also runs, judging by the printout in R, but I keep getting an error message from the mafft() call that aligns conspecific sequences (see the last two lines of the transcript below). When I call ips::mafft() directly, the function works fine. My question is: what can I try to resolve this error?
>library(devtools); install_github("heibl/megaptera")
>library("megaptera")
># set up the PostgreSQL database connection
>drv <- dbDriver("PostgreSQL")
>conn <- dbPars(dbname = "Cetacea", host = "localhost", port = 5432, user = "openpg", password = "new_user_password")
>show(conn)
###PostgreSQL connection parameters:
### host = localhost
### port = 5432
### dbname = cetacea
### user = openpg
### password = new_user_password
# create taxonomic backbone
>tax <- taxon(ingroup = "Cetacea",
              outgroup = c("Sus scrofa"),
              kingdom = "Metazoa")
>tax
###--- megaptera taxon class ---
###ingroup : Cetacea
###is extended : no
###outgroup : Sus scrofa
###is extended : no
###in kingdom : Metazoa
###hybrids : excluded
###guide tree : taxonomy-based
# set the gene loci of interest
>loci <- locus("cox1")
>loci
###Locus definition for cox1
###kind : gene
###search strings : cox1, COI, COX1, coi, Cox1
###search fields : gene, title
###use genomes : TRUE
###SQL tables : acc_cox1, spec_cox1
###alignment method : auto
###minimum identity : 0.75
###minimum coverage : 0.5
# set the function parameters
>pars <- megapteraPars()
>pars
###MEGAPTERA pipeline parameters:
### parallel = FALSE
### cpus = 0
### cluster.type = none
### update.seqs = all
### retmax = 500
### max.gi.per.spec = 100
### max.bp = 5000
### reference.max.dist = 0.25
### min.seqs.reference = 10
### fract.miss = 0.25
### filter1 = 0.5
### filter2 = 0.25
### filter3 = 0.05
### filter4 = 0.2
### block.max.dist = 0.5
### min.n.seq = 5
### gb1 = 0.5
### gb2 = 0.5
### gb3 = 9999
### gb4 = 2
### gb5 = a
# define x to pass to step() functions
>x <- megapteraProj(db = conn,
                    taxon = tax,
                    locus = loci,
                    align.exe = "C:/Users/Gregory/Programs/MAFFT/mafft",
                    mask.exe = "C:/Users/Gregory/Programs/GBlocks/Gblocks")
# begin the pipeline with stepA
> stepA(x)
###megaptera 1.0-52
###2017-03-21 18:29:15
###STEP A: searching and downloading taxonomy from GenBank
###taxonomy already downloaded
###STEP A finished after 0.06 secs
# move on to stepB
>stepB(x)
###megaptera 1.0-52
###2017-03-21 18:30:22
###STEP B: searching and downloading sequences from GenBank
###...
# then stepC
>stepC(x)
###megaptera 1.0-52
###2017-03-21 18:31:43
###STEP C: alignment of conspecific sequences
### 72 species in table acc_cox1
### 8 species have 1 accession
### 64 species have > 1 accession
### 63 species are already aligned
### 1 species need to be aligned
###-- 9 seqs. of Delphinapterus_leucas
###**Error in mafft(seqs, method = "auto", path = megProj@align.exe) :**
### **unused argument (path = megProj@align.exe)**
I believe this may be an issue with the alignSpecies() function that is called towards the end of stepC(x). This function makes an external call to the ips::mafft() alignment wrapper, but it doesn't seem to pass its arguments correctly: the error says that mafft() does not accept the path argument it is given. Could this be because I'm using a Windows computer?
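One check that might narrow this down is to compare the argument names a function declares with the ones a caller passes. The sketch below assumes, without my having confirmed it, that the installed ips version simply has no `path` argument in mafft(). The helper `accepts_arg()` and the two toy functions are hypothetical stand-ins written for illustration; they are not part of megaptera or ips.

```r
# Hypothetical helper: does function `f` declare an argument named `arg`?
accepts_arg <- function(f, arg) {
  arg %in% names(formals(f))
}

# Toy stand-ins for two versions of an alignment wrapper.
# These are NOT the real ips code; the argument name `exec` is an assumption.
mafft_old <- function(x, method = "auto", path) NULL
mafft_new <- function(x, method = "auto", exec) NULL

# Inspect which stand-in declares a `path` argument
has_path_old <- accepts_arg(mafft_old, "path")
has_path_new <- accepts_arg(mafft_new, "path")

# Calling the second stand-in with `path =` raises the same kind of error
# as in the stepC() transcript; tryCatch() captures the message as a string.
res <- tryCatch(
  mafft_new("seqs", method = "auto", path = "mafft"),
  error = function(e) conditionMessage(e)
)
print(res)
```

With the real packages loaded, the equivalent check would be `accepts_arg(ips::mafft, "path")`, and `packageVersion("ips")` shows which ips version is installed. If the check returns FALSE, the mismatch would be between the megaptera and ips versions rather than anything specific to Windows, though that remains a guess until verified.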