Question

Pairwise Sequence Alignment With R

5


13.6 years ago

Lakshmi ▴ 60

Hi all,

I used the following R program for pairwise sequence alignment. My data set contains 90 protein sequences. I would like to do pairwise sequence alignment with each pair of sequences.

library("seqinr")
seq1 <- read.fasta(file = "first.fasta")
seq2 <- read.fasta(file = "second.fasta")
seq1string <- toupper(c2s(seq1[[1]]))
seq2string <- toupper(c2s(seq2[[2]]))
library(Biostrings)
globalAlign <- pairwiseAlignment(seq1string, seq2string)
globalAlign
pid(globalAlign, type = "PID3")

The file second.fasta is my data set, and first.fasta contains the first sequence. With this program I align the first sequence against the second sequence. Next, I have to align the first sequence with the third, the first with the fourth, and so on up to the 90th. What changes do I have to make to my program to visualize the alignment of each pair of sequences?

alignment r • 20k views

ADD COMMENT • link updated 5.3 years ago by gjhansi111 • 0 • written 13.6 years ago by Lakshmi ▴ 60

5


put a loop around... what else?

ADD REPLY • link 13.6 years ago by Michael 56k

0


Hello, I am trying to align 50 sequences with one sequence and find the percent identity, i.e. seq1 vs seq2, seq1 vs seq3, etc.

I have tried the above code, @David W:

x <- sapply(seq2, function(seq1) pairwiseAlignment(seq2, seq1, type = "global", substitutionMatrix = mat, gapOpening = 10, gapExtension = 1))

I can get the alignment scores, but when I try to get the percent identity it gives me the following error:

pid(x, type = "PID4")

   Error:         Error in (function (classes, fdef, mtable)  : 
                    unable to find an inherited method for function ‘pid’ for signature ‘"list"’

The error may be because sapply returns a list, whereas pid expects a pairwise alignment object (e.g. PairwiseAlignmentsSingleSubject).

Thank you

ADD REPLY • link 5.3 years ago by gjhansi111 • 0

0


Hi gjhansi111,

I am having the same problem - do you have a solution for this?

Many thanks

Tom

ADD REPLY • link 5.2 years ago by tom.lewin1 • 0

score 5 · Answer 1 · 2011-12-23

Hi Lakshmi,

It's not quite clear from your question, but do you want to do a pairwise alignment of each of your 90 sequences against a particular sequence (i.e. seq2[[1]] v seq1, then seq2[[2]] v seq1 in your example), or do you want to do all the possible pairwise comparisons between your 90 sequences?

The first one is easy: use an apply function. I don't have Bioconductor on this computer, so this isn't tested, but something like

sapply(seq2, function(x) pairwiseAlignment(toupper(c2s(x)), seq1string))

It's probably more readable if you define a function first:

convert_then_align <- function(a,ref_seq){
  seq_string <- toupper(c2s(a))
  return(pairwiseAlignment(seq_string, ref_seq))
}

sapply(seq2, convert_then_align, seq1string)
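For the one-against-many case, a possible alternative is to let pairwiseAlignment do the looping itself. It accepts a whole set of patterns against a single subject and returns one alignment object, so pid() gives one value per sequence instead of failing with "unable to find an inherited method" on a list. This is only a sketch: the toy sequences are made up, the BLOSUM62 matrix and gap penalties are assumed scoring choices, and on recent Bioconductor releases pairwiseAlignment, pid and writePairwiseAlignments are provided by the pwalign package rather than Biostrings.

```r
library(Biostrings)
# Assumption: newer Bioconductor releases ship these alignment functions
# in the pwalign package, so load it too when it is installed.
if (requireNamespace("pwalign", quietly = TRUE)) library(pwalign)

# Made-up toy data standing in for first.fasta and second.fasta
ref_seq  <- AAString("MKTAYIAKQR")
all_seqs <- AAStringSet(c(a = "MKTAYIAKQR", b = "MKTAYLAKQR", c = "MKAYIAKQ"))

# One call aligns every pattern against the single subject
aln <- pairwiseAlignment(pattern = all_seqs, subject = ref_seq,
                         type = "global", substitutionMatrix = "BLOSUM62",
                         gapOpening = 10, gapExtension = 1)

# pid() accepts this object and returns one percent identity per pattern
identities <- pid(aln, type = "PID3")
names(identities) <- names(all_seqs)
identities

# Print the alignments one after another in a readable text layout
writePairwiseAlignments(aln)
```

With the files from the question, readAAStringSet("second.fasta") should give the AAStringSet directly, without going through seqinr and c2s.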

All possible combinations is a little trickier. The way I do these is to make the indices first:

all_pairs <- combn(1:length(seq2), 2)
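To make the shape of that index matrix concrete, here is a small base R illustration with four sequences instead of 90 (the number four is just for display):

```r
# Index pairs for four sequences: each column is one unordered pair
all_pairs <- combn(1:4, 2)
all_pairs
#      [,1] [,2] [,3] [,4] [,5] [,6]
# [1,]    1    1    1    2    2    3
# [2,]    2    3    4    3    4    4

# The number of columns is n * (n - 1) / 2, so 90 sequences give 4005 pairs
ncol(combn(1:90, 2))
```

Each column holds the two indices of one comparison, which is why the later step walks over columns.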

Then you need a function to do your converting and aligning:

align_from_index <- function(sequences, index.a, index.b){
  # [[ ]] extracts the sequence itself; [ ] would return a one-element list
  seq1 <- toupper(c2s(sequences[[index.a]]))
  seq2 <- toupper(c2s(sequences[[index.b]]))
  return( pairwiseAlignment(seq1, seq2) )
}

res <- apply(all_pairs, 2, function(indices) align_from_index(seq2, indices[1], indices[2]) )

Which you can turn into a matrix if pairwiseAlignment returns something that makes sense for that, i.e. one number per pair, such as an alignment score (there must be a less hacky way of doing this!):

attributes(res) <- attributes(dist(1:length(seq2)))
res <- as.matrix(res)
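One possibly less hacky route is to fill a square matrix directly while looping over the index pairs. The sketch below stores percent identity rather than the alignment objects. It rests on assumptions: the toy sequences are made up, BLOSUM62 with gap penalties 10 and 1 is just one possible scoring scheme, identity is treated as symmetric, and on recent Bioconductor releases pairwiseAlignment and pid come from the pwalign package.

```r
library(Biostrings)
# Assumption: newer Bioconductor releases ship these functions in pwalign
if (requireNamespace("pwalign", quietly = TRUE)) library(pwalign)

# Made-up toy sequences standing in for the 90 proteins
seqs <- AAStringSet(c(a = "MKTAYIAKQR", b = "MKTAYLAKQR", c = "MKAYIAKQ"))
n <- length(seqs)

# Square matrix of percent identities; a sequence is 100% identical to itself
pid_mat <- matrix(100, nrow = n, ncol = n,
                  dimnames = list(names(seqs), names(seqs)))

all_pairs <- combn(n, 2)
for (k in seq_len(ncol(all_pairs))) {
  i <- all_pairs[1, k]
  j <- all_pairs[2, k]
  aln <- pairwiseAlignment(seqs[[i]], seqs[[j]],
                           substitutionMatrix = "BLOSUM62",
                           gapOpening = 10, gapExtension = 1)
  # Fill both triangles so the matrix is symmetric
  pid_mat[i, j] <- pid_mat[j, i] <- pid(aln, type = "PID3")
}
pid_mat
```

For 90 sequences this runs 4005 alignments, which should be manageable for proteins of ordinary length.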