Multi Pairwise Sequence Alignment in R
1
1
Entering edit mode
9.6 years ago
mastercod3r ▴ 10

Hello guys, I am final year bioinformatics student, doing internship at the moment. So I am sorry for my noobiness in advance.

Here is what I was told to do, I have 100+ fasta files, and need to run pairwise sequence alignment with each other and save the outcome. Here is what I did and the errors I am getting.

library(seqinr)

setwd("C:\\Users\\Celik\\Downloads\\fasta\\R_deneme")
files = list.files(pattern="*.fasta")
seq = lapply(files, read.fasta)

library(Biostrings)

pws.alignments <- sapply(1:(length(seq)-1), function(ind) {
  sapply((ind+1):length(seq), function(sec.ind) {
    return(pairwiseAlignment(seq[[ind]], seq[[sec.ind]]))
  })
})
Error in (function (classes, fdef, mtable)  :
  unable to find an inherited method for function 'seqtype' for signature '"SeqFastadna"'

I appreciate your help, thanks.

Pairwise-Sequence-Alignment R • 4.6k views
ADD COMMENT
3
Entering edit mode

maybe try readDNAStringSet instead read.fasta ?

ADD REPLY
2
Entering edit mode
9.6 years ago
Brice Sarver ★ 3.8k

Are you doing a true pairwise alignment, i.e., each fasta file has a single sequence? Make sure that seq[[ind]] etc. are of class DNAStringSet. You might need to do a single indexing (seq[ind]) depending on what the lapply returns. Also, the comment about using readDNAStringSet is correct. If not, check Biostrings for MSAs.

Also, alignments in general are quite slow in R, with the exception of simple cases. I would recommend 1) doing this very simply with calls to Muscle or Mafft in a shell script or 2) using a system call in R.

ADD COMMENT
0
Entering edit mode

Hello, thanks for your replies guys.

Yes I need to do a true pairwise alignment, each fasta file has a single sequence. I'll work on readDNAStringset hopefully it will work.

And lastly yes I was many times told that alignment programs would be easier to do this job but my supervisor insists on doing it in R. So yeah gotta do it with R. Thanks again.

ADD REPLY
0
Entering edit mode

Hello Brice, after applying readAAStringSet, it works but partially. I put 5 files in a directory to try out the coding so what it does is

Runs pairwise with file1 vs file2... till file 5 (no problem), but then file2 is till file4, file4 is till f2, and file5 is till file1.

I don't know where is the mistake. here is my update code again. Sorry for bad English. Thanks in advance.

library(seqinr)
library(Biostrings)

setwd("C:\\Users\\Celik\\Downloads\\fasta\\R_deneme")
files = list.files(pattern="*.fasta")
seq = lapply(files, readAAStringSet)
pws.alignments <- sapply(1:(length(seq)), function(ind) {
  sapply((ind):length(seq), function(sec.ind) {
    return(pairwiseAlignment(seq[[ind]], seq[[sec.ind]]))
  })
})
ADD REPLY

Login before adding your answer.

Traffic: 5036 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6