Hi there,
I try to write an R-script with which I can align a bench of sequences in one file with a single sequence in another file.
So far, I'm really happy with the results, but I have one big problem. How to write the pairwiseAlignment output in fasta format WITH header names?
I did this (among others):
seq1 <- readDNAStringSet("file1.fasta", use.names=T)
seq2 <- readDNAStringSet("file2.fasta", use.names=T)
mat <- nucleotideSubstitutionMatrix(match = 1, mismatch = -3, baseOnly= F)
# here I will introduce a for-loop to align every sequence against the sequence in file2
globalAlign <- pairwiseAlignment(seq1[1], seq2[1], type='global-local', substitutionMatrix=mat, gapOpening=10, gapExtension=-5)
r = BStringSet( c(toString(subject(globalAlign)), toString(pattern(globalAlign))) )
writeXStringSet(r,"out1.txt")
But my "out1.txt"-outfile looks like this:
>
ATGCGATGCTAGCTGCATAGCTCGATCG
>
ATGCGAT---AGCTGCATAGCT---TCG
Has anyone of you an idea how to include the sequence names so that it will look like this:
>seq_name_1
ATGCGATGCTAGCTGCATAGCTCGATCG
>seq_name_2
ATGCGAT---AGCTGCATAGCT---TCG
Many thanks in advance!