That XML sheet doesn't seem to translate it in the correct reading frame. Based on the UCSC genome browser, the correct amino acid sequence for mm10 chr3:93396405,93396500 is QDSPHRGQK...
I'm trying to get the amino acid sequence that displays in the UCSC genome browser itself.
So if you want to know the amino acid translated for the gene at this position, the stylesheet won't work. You should work with the knownGene table to get the position of the exon.
So are you saying that there is no direct way to download the amino acid sequence for a given region? That's hard to believe, considering that it is displayed in the genome browser.
That's all I'm looking for. A direct way to download the amino acid sequence for a given region. I'd rather avoid MySQL table lookups.
Can you explain what you're trying to do? It doesn't make a lot of sense to try to get an amino acid for a random piece of DNA. If you need the amino acid sequence, you usually first click on a transcript. The resulting page has a link for the protein sequence.
Oh, I start to understand: you're looking at an exon. You can see the amino acid sequence shown on the screen. But if you click on the exon, all you can get is the full amino acid sequence of the whole transcript, not the little piece that you have on the screen.
Can you still explain a little bit more what the final point of this would be? I struggle with finding a use case for this, where this particular function could be useful...
I'm a rising college freshman with very little bioinformatics experience (although I do have significant programming experience), so please bear with me.
I have a spreadsheet of single nucleotide mutations at certain positions in the mm10 genome (e.g. chr11:3133305 C->A). For each mutation, I'm trying to determine: 1) whether it occurs in a coding region of the genome, 2) whether it yields a change in amino acid, and 3) what that change is.
The process by which I'm thinking of accomplishing that is this: download the nucleotide and amino acid sequence in a small range around the mutation, determine the reading frame of the DNA sequence, and from there determine the codon that the mutation occurs in. From there, it is trivial to find the change in amino acid resulting from the mutation.
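For anyone following along, the codon step described above can be sketched roughly as below. This is only an illustration in Python, not the Java program mentioned in this thread. The function name `codon_change` and the toy sequence are made up for the example, and it assumes you already have the coding sequence in transcript orientation, starting at the first base of the start codon, plus the 0-based offset of the mutated base within it. For a gene on the minus strand, the reference and mutated bases would need to be complemented first.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def codon_change(cds, offset, alt):
    """Return (ref_codon, alt_codon, ref_aa, alt_aa) for a single-base substitution.

    cds    -- coding sequence in transcript orientation, starting at the start codon
    offset -- 0-based index of the mutated base within cds (hypothetical input)
    alt    -- the mutated base, in the same orientation as cds
    """
    frame_pos = offset % 3              # position of the base inside its codon
    codon_start = offset - frame_pos    # index of the first base of that codon
    ref_codon = cds[codon_start:codon_start + 3].upper()
    if len(ref_codon) < 3:
        raise ValueError("offset falls in an incomplete codon at the end of cds")
    alt_codon = ref_codon[:frame_pos] + alt.upper() + ref_codon[frame_pos + 1:]
    return ref_codon, alt_codon, CODON_TABLE[ref_codon], CODON_TABLE[alt_codon]


# Toy sequence (not real mm10 data): ATG CAG GAT TCT translates to M Q D S.
toy_cds = "ATGCAGGATTCT"
print(codon_change(toy_cds, 3, "A"))  # first base of the second codon
print(codon_change(toy_cds, 5, "A"))  # third base of the second codon
```

If the two amino acids returned are identical, the mutation is synonymous; a `*` in the result marks a stop codon.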
I already have a Java program written that accomplishes the above given the nucleotide and amino acid sequence - I just need a way to download these sequences for a given region.
As I said earlier, I do have very little bioinformatics experience, so I would welcome suggestions for better ways to accomplish my goal.
That seems perfect, thanks a lot! I'll definitely use it.
For the sake of having the question answered, is there a way to download the amino acid sequence for a region in plain text? I'd hate for someone who needs that and arrives at this thread to not find an answer.
There is nothing I know of. You can click on a transcript and get the full amino acid sequence but not the current slice in view, at least not that I know...
That XML sheet doesn't seem to translate it in the correct reading frame. Based on the UCSC genome browser, the correct amino acid sequence for mm10 chr3:93396405,93396500 is QDSPHRGQK...
I'm trying to get the amino acid sequence that displays in the UCSC genome browser itself.
That's because the DAS segment starts with 93396406:
So if you want to know the amino acid translated for the gene at this position, the stylesheet won't work. You should work with the knownGene table to get the position of the exon.
For example, see: Is A Genome Position In An Exon Or Intron?
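To make the knownGene suggestion a bit more concrete, here is a minimal sketch in Python of checking a position against one row of that table. It assumes you have already fetched the row's `cdsStart`, `cdsEnd`, `exonStarts` and `exonEnds` fields (the last two are comma-separated lists), and that the table uses 0-based, half-open coordinates while the browser shows 1-based positions. The function names and the record values below are invented for illustration and are not real mm10 data.

```python
def parse_blocks(exon_starts, exon_ends):
    """Turn the comma-separated exonStarts/exonEnds strings into (start, end) pairs."""
    starts = [int(x) for x in exon_starts.strip(",").split(",")]
    ends = [int(x) for x in exon_ends.strip(",").split(",")]
    return list(zip(starts, ends))


def classify_position(pos, cds_start, cds_end, exon_starts, exon_ends):
    """Classify a 1-based genome position against one transcript row.

    Table coordinates are treated as 0-based, half-open intervals, so the
    1-based browser position is shifted down by one before comparing.
    """
    p = pos - 1
    for start, end in parse_blocks(exon_starts, exon_ends):
        if start <= p < end:
            # Inside an exon: coding only if it also lies within the CDS bounds.
            if cds_start <= p < cds_end:
                return "coding exon"
            return "non-coding exon (UTR)"
    return "not in an exon"


# Hypothetical transcript row with three exons.
row = {
    "cds_start": 1100,
    "cds_end": 3100,
    "exon_starts": "1000,2000,3000,",
    "exon_ends": "1200,2100,3300,",
}
for position in (1050, 1150, 1500):
    print(position, classify_position(position, **row))
```

A position that comes back as "coding exon" is the case where working out the reading frame and the affected codon makes sense; the frame then follows from how many coding bases precede the position in the transcript.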