hi
I am begginer in Bioinformatics and I have a laptop with OS Ubuntu.
The question is:
How can extract nucleotide frequency for position on aligned sequences in clustal omega?
Thanks
Cèsar
hi
I am begginer in Bioinformatics and I have a laptop with OS Ubuntu.
The question is:
How can extract nucleotide frequency for position on aligned sequences in clustal omega?
Thanks
Cèsar
You could use the seqinr package for R. First read in your alignment with the read.alignment() function then use the consensus() function with method="profile" to get a matrix of counts of each nucleotide at each position.
Another similar option would be with BioPython, to extract a PSSM (Position Specific Score Matrix):
from Bio import SeqIO
from Bio.Align import AlignInfo
align = AlignIO.read('~/path/to/alignment.aln', 'format') # Whatever format your mutliple sequence alignment is in
summary_align = AlignInfo.SummaryInfo(align)
consensus = summary_align.dumb_consensus()
my_pssm = summary_align.pos_specific_score_matrix(consensus,chars_to_ignore = ['N'])
print(my_pssm)
Should end up looking something like:
G A T C
G 1 1 0 1
T 0 0 3 0
A 1 1 0 0
T 0 0 2 0
C 0 0 0 3
Instructions from: http://biopython.org/DIST/docs/tutorial/Tutorial.html#htoc301
Following method suggested by Healey,
I really like to output the PSSM ouput with frequency in each cell as you mentioned. However, I follow your method above, the method gives me something like: 0 7.0 0 7.0 0 0 0 7.0 in the cells ect ... within the output. It seems to be threshold 70%. But I really want actual frequencies.
Could anyone share how to output the actual frequencies of alleles within each cell, assuming that the left side is position of an alignment?
Highly appreciate you suggestions. Thank you.
This would probably be better as a new question so you can demonstrate your code and input data.
The dumb consensus has a threshold of 0.7 by default (I think), but to my knowledge that doesn’t affect the PSSM; in my example code above the consensus sequence is purely used as a ‘marker sequence’ down the left hand side of the PSSM. There is nothing in the documentation, at a glance, about a default threshold but I may well be wrong (and can’t currently check).
Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Hi Jean-Karim Heriche
Thanks for your answer. I had installed R and seqinr, but I got a error message.
The file with the alignement is in: home/Documents/A/B/file1.clustal
c@c-HP-245-G4-Notebook-PC:~$
c@c-HP-245-G4-Notebook-PC:~$cd Documents/
c@c-HP-245-G4-Notebook-PC:~/Documents$
c@c-HP-245-G4-Notebook-PC:~/Documents$ cd A/
c@c-HP-245-G4-Notebook-PC:~/Documents/A$
c@c-HP-245-G4-Notebook-PC:~/Documents/A$ cd B/
c@c-HP-245-G4-Notebook-PC:~/Documents/A/B$
c@c-HP-245-G4-Notebook-PC:~/Documents/A/B$ R
R version 3.0.2 (2013-09-25) -- "Frisbee Sailing" Copyright (C) 2013 The R Foundation for Statistical Computing Platform: x86_64-pc-linux-gnu (64-bit)
The manual say we have to that write:
I tried save the file in fasta format in the site of clustal omega and change the command > fasta.res <- read.alignment but the result is the same.
can you help me to localizate the error?
Thanks
Cesar
Don't use system.file() here. This is used for files that come with packages. Also note that you may be missing the first / in the file path. In any case, make sure the path is correct. You just need
If you still get an error, check also the file permissions.
Jean-Karim Heriche Thanks for your answer!
I ran the correct command and I can see the alignment in the ubuntu terminal.
read.alignment( file = "/home/Documents/A/B/clustalomega1.clustal", format="clustal")
In the final lines I can read this.
$com
[1] NA
attr(,"class")
[1] "alignment"
For use the function “consensus”, I do not know the use of “matali”
I Ran
consensus(alignment, method = "profile")
Error en inherits(matali, "alignment") : objeto 'alignment' no encontrado
Could you help me to localizate the error?
Thanks Cesar
In general, when programming, one assigns the result of a function to a variable so that it can be reused, e.g. in your case:
Although we do occasionally answer such questions here, this forum is not for programming questions. I would encourage you to first learn how to program and then to direct your programming questions to the Stack Overflow forum.
Thanks for your attention!