R/Biostring: Print Sequence Alignment To File
1
2
Entering edit mode
12.6 years ago
Emempe ▴ 50

I do a simple pairwise DNA sequence alignment with pairwiseAlignment from the Biostrings package in Bioconductor:

library('Biostrings')
seq1 = 'ATGCTA'
seq2 = 'ATGTA'
pairwiseAlignment(pattern = seq1, subject = seq2)

The output looks as follows:

Global PairwiseAlignedFixedSubject (1 of 1)
pattern: [1] ATGCTA 
subject: [1] ATG-TA 
score: -4.091219

For very long sequences, the output is truncated and only one line is shown:

Global PairwiseAlignedFixedSubject (1 of 1)
pattern: [1] AT-----------------------------...----------------TGTCTTCCAKATCTGGCGCGCCTGGGTTGATATC 
subject: [1] ATTGGCGGCCGCGCCACCATGCCAGAGCCAG...GAAGGCTGTATGCTGTTGTCTTCAAGATCTGGTACCGCTGGGTTGATATC 
score: -29418.8

How can I output the complete alignment to a text file?

bioconductor sequence alignment • 5.8k views
ADD COMMENT
0
Entering edit mode

Looking briefly to your long alignment shows an AT base pair much closer to the rest of the sequence. In fact alignment start at 1 for each sequence. Question to anyone: how to avoid such a suboptimal alignment ?

ADD REPLY
2
Entering edit mode
12.6 years ago

You can get to the "text representation" of the pairwise alignment by using the pattern and subject accessors, then convert these to normal character object that you can write "as usual" eg:

library('Biostrings')
seq1 <- 'ATGCTA'
seq2 <- 'ATGTA'
pa <- pairwiseAlignment(pattern = seq1, subject = seq2)

as.character(pattern(pa))
[1] "ATGCTA"

as.character(subject(pa))
[1] "ATG-TA"

Not sure how you want to write it out, but there's (at least) a start.

ADD COMMENT
0
Entering edit mode

Thanks! I think I have to write an output function myself ... one would have to put it in blocks of 60, 80 or whatever nucleotides.

ADD REPLY

Login before adding your answer.

Traffic: 1627 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6