I downloaded an alignment of 35 mammals. I selected a random fragment. In the case of mouse it corresponded to this section:
>mus_musculus_2_9117025_9127689_-1__chr_length=182113224_
TTATTTTGGATAAATACATTAAAAATTTAAATTTAGTTATTGTTAGTACTTGGATAAGTAGGAT
When I search for that sequence (just the first line of it) using Ensembl BLAST, it finds it (though not at that position), but when I download the data for that region (whether from Ensembl or NCBI) the sequence does not correspond. I then downloaded the full chromosome 2 of the mouse reference genome and searched it for that sequence, and it doesn't appear anywhere. Not even close. What am I missing?
I need this because I want to extract the alignment for that section from other species. I tried to find a local alignment for the full sequence, but the result was terrible. Then I tried with the mouse alone and the same happened. Then I realized that the sequence used in the 35 mammals alignment apparently does not exist in the mouse genome, even though BLAST finds it... I am lost. Any help would be appreciated.
This sequence does exist in the mouse genome:
How did you search the sequence?
How To Ask Good Questions On Technical And Scientific Forums
Hi h.mon. As I said, I already found that sequence using BLAST; that is the same screen you posted. The problem comes when I try to download that sequence and the surrounding area: the sequence is not what I get if I use these positions as parameters in the Ensembl browser. What I get is this:
As I also said, I downloaded the full chromosome 2 sequence from an FTP site (in fact, two versions from two FTP sites), searched it with a text editor, and that sequence does not appear. Sorry if I am not accurate with the terminology. I am a newbie in bioinformatics, but I have plenty of experience asking technical questions in other fields. I don't see any problem with the question, but if there is one, please share your impressions.
Just be more detailed. First and foremost, you should have shown the sequence you found.
But you also should have said from the beginning how you downloaded the particular region (using Ensembl BioMart? etc.), how you searched for the sequence, whether you used a local BLAST or the NCBI (or Ensembl) BLAST server, and so on. Generally speaking, it is also a good idea to paste the exact commands you used.
When you do this, people have more information and are able to provide more detailed, higher-quality suggestions. For example, although very tempting (I myself do this), searching for a pattern in a FASTA file is not advisable, because line wrapping can result in false negatives. I would have advised you to perform a local BLAST search against the downloaded chr2, or to use BBDuk from the BBTools package:
Both programs search both strands, so they also find patterns on the opposite strand, which a text editor does not search automatically.
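To illustrate the two pitfalls above (line wrapping and the opposite strand), here is a minimal Python sketch of an exact-match search. It is not what BLAST or BBDuk do internally; it only joins the wrapped lines of each FASTA record and looks for the pattern and its reverse complement. The function names and the toy record are made up for the example, and it assumes the sequence fits in memory.

```python
import io

# Complement table for DNA (N stays N); used to build the reverse complement.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def read_fasta(handle):
    """Yield (header, sequence) pairs, joining wrapped lines into one string."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def find_both_strands(handle, pattern):
    """Return (header, 0-based start on the forward strand, strand) per exact hit."""
    pattern = pattern.upper()
    rc = reverse_complement(pattern)
    hits = []
    for header, seq in read_fasta(handle):
        seq = seq.upper()  # ignore case, e.g. lowercase soft-masked bases
        for strand, query in (("+", pattern), ("-", rc)):
            start = seq.find(query)
            while start != -1:
                hits.append((header, start, strand))
                start = seq.find(query, start + 1)
    return hits


# Toy record wrapped at 6 bases per line. "GTAG" never appears as plain text,
# but its reverse complement "CTAC" spans the first line break.
toy = ">toy\nAAACCT\nACTTAT\nGG\n"
print(find_both_strands(io.StringIO(toy), "GTAG"))  # [('toy', 4, '-')]
```

For a real file you would pass an open file handle instead of the `io.StringIO` object. A hit reported with strand `-` means the text you searched for is the reverse complement of what is stored in the file, which is worth checking when the sequence header carries a strand field such as `-1`.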