Mouse Sequence Compared To Human Sequence
13.9 years ago
Rodney ▴ 20

I have a human sequence surrounding a SNP in a gene, and I want to compare it with the mouse gene sequence. The human genome draft (UCSC) says this area is homologous with mouse. But when I BLAST it on NCBI, the mouse sequence around and including the SNP does not come up, even when I relax the stringency for mismatches. BLAST only brings up the mouse sequence about 200 bp upstream of the SNP. I'm not sure why the two websites disagree. Is there any other tool where I can visualize and compare DNA sequences between mouse and man, no matter how discordant they are?

Thanks for your assistance in this matter.

sequence mouse human • 7.3k views
13.9 years ago

One problem here is that "homology" is relative, and individual conserved regions of synteny between mouse and human are often broken into pieces or only partial. Quick and dirty example, since you didn't tell us your SNP:

Take rs2456449, which is in 8q24. Drop that into the UCSC genome browser; it returns the region chr8:128,192,731-128,193,231. Zoom in 10x, click DNA on the header, and press the Get DNA button. Copy the DNA returned (TAAGTTTGCGCCTGGTGAAAAAAAAAAAAAACAATATATGCACATGTGCA, 50 nt), click the BLAT link, and BLAT the sequence against the mouse genome. This returns three results: one is broken into two pieces, and the other two are partial perfect matches 22 nucleotides long. However, I don't think any of those is the proper syntenic region.

Now drop that sequence into MegaBLAST with default settings; you don't see any matches. BLAT is not BLAST; it works off of 11-mers. I think BLAST doesn't pick up these loci because the short matching regions are only half the size of my total query sequence, and the first match is broken up by a large interval. In any case, not very helpful.
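To illustrate why seed length matters here, the following is a toy sketch only, not how BLAT or MegaBLAST are actually implemented. It assumes MegaBLAST's default word size of 28 and the 11-mer tiles mentioned above. The sequences are made-up toy data containing a single 22-nt shared stretch (they are not the real BLAT hits), and `kmers` and `shared_seeds` are hypothetical helper names.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings of seq (case-insensitive)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def shared_seeds(query, target, k):
    """Return the exact k-mers that occur in both sequences."""
    return kmers(query, k) & kmers(target, k)


# Toy data (assumption): one 22-nt stretch is identical in both sequences,
# and the flanking bases deliberately differ.
shared_stretch = "TAAGTTTGCGCCTGGTGAAAAA"            # 22 nt
query = shared_stretch + "CA" * 14                   # 50 nt in total
target = "GT" * 5 + shared_stretch + "GT" * 10

seeds_11 = shared_seeds(query, target, 11)   # 11-mer seeding
seeds_28 = shared_seeds(query, target, 28)   # 28-mer seeding

print("shared 11-mers:", len(seeds_11))
print("shared 28-mers:", len(seeds_28))
```

A 22-nt perfect match contains several exact 11-mers but can never contain an exact 28-mer, so a search seeded with 28-mers has nothing to start from in a case like this.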

Head back to the genome browser and turn on the Mammal conservation track, which is probably the right approach from the start. Zoom in and look at the mouse track; you'll see the mouse DNA sequence ----attgt-atctcatg-----caaaacagtgagtttgttcaagtca-aaagcatctatgcacacagac mapped to that interval 128192956 - 128193005 on human. Pull out the dashes from the mouse sequence and blat that on the mouse genome; you get a perfect match on chromosome 15, which is the proper syntenic region.
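Pulling the dashes out by hand is error-prone, so here is a minimal sketch of that one step, using the mouse sequence quoted above. The helper name `strip_gaps` is made up for illustration; the result is what you would paste into BLAT.

```python
def strip_gaps(aligned_seq, gap_char="-"):
    """Remove alignment gap characters and return the sequence in upper case."""
    return aligned_seq.replace(gap_char, "").upper()


# Mouse track sequence as copied from the conservation track (dashes are gaps)
gapped = "----attgt-atctcatg-----caaaacagtgagtttgttcaagtca-aaagcatctatgcacacagac"
ungapped = strip_gaps(gapped)

print(ungapped)
```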


My intended meaning was that while the concept of homology has a specific meaning, one which I think we both understand, the specific degree of sequence conservation as a result of shared evolutionary history for a given gene is variable. Perhaps my pun was a bit obscure, but what is homology if not an assertion that two species are relatives?


This needs to be added as a homework problem for anyone taking a bioinformatics course.


@David: saying that "homology is relative" is quite ambiguous. It has a specific meaning in biology, which I do not find "relative". Proving that two sequences are homologous, well, I agree that can be arguable sometimes ;)


I agree with you there, but obviously the person asking the question is confused about the meaning, so making a pun out of it will probably not help him...

13.9 years ago
Mark Evans ▴ 50

Hi Rodney,

Not sure if this is the best solution, but if you know the corresponding area on the mouse genome where you are expecting it to hit, you could grab that DNA and try a pairwise BLAST rather than a general BLAST. This should at least let you see the specific discordance between your sequences in that region.

Mark


This can be an excellent solution to the problem.

13.9 years ago
Nicojo ★ 1.1k

First up: as a simplification, homology means that the trait in question has a common origin. It certainly does not mean that the sequences are the same.

Second: BLAST is the acronym for Basic Local Alignment Search Tool. It is "basic" and "local". As a result, the sequence you're interested in (your query) will usually not be returned aligned over its full length with the sequences found in the database you searched; you will only get local alignments for the best matching parts.
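To make "local" concrete, here is a minimal sketch of a textbook Smith-Waterman local alignment. This is not BLAST's heuristic, only the exact algorithm it approximates. The scoring values and the two toy sequences are arbitrary assumptions chosen for illustration.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return (score, aligned_a, aligned_b) for one best local alignment."""
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]   # scoring matrix, floored at 0
    best, best_pos = 0, (0, 0)

    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            H[i][j] = max(0, diag, H[i - 1][j] + gap, H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the best cell until the score drops to zero
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
        if H[i][j] == diag:
            out_a.append(a[i - 1]); out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-")
            i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


# Toy sequences (assumption): only the central ACGTACGT is shared
seq_a = "GGGGACGTACGTCCCC"
seq_b = "TTTTACGTACGTAAAA"
score, aln_a, aln_b = smith_waterman(seq_a, seq_b)
print(score, aln_a, aln_b)
```

Only the shared core is reported; the divergent flanks are dropped from the result, which is the same behaviour you see when BLAST returns a hit that stops short of your SNP.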

According to what you say, it seems obvious that the region you're interested in is not the most conserved part of that gene between human and mouse.

My suggestions are as follows:

  • When you find vocabulary that you're not sure you understand, look it up.
  • When you use tools: be sure you understand what they do and how to use them.

Regarding your problem at hand:

  • Retrieve the genes you're interested in from the species you want to compare (human and mouse in your case; you can probably find these in GenBank or from other sources)
  • Be aware that there may be several variants of these genes within each species (as I understand you are looking at a polymorphic site)
  • Use a pairwise or multiple sequence alignment tool to align them
  • Look within the alignment for the specific region you're interested in
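For the last step, a position in the original sequence no longer matches its column once gaps have been inserted. The sketch below shows one way to find the alignment column for a known position (for example your SNP) and print the surrounding window. The two aligned strings are made-up toy data, and `alignment_column` and `window_around` are hypothetical helper names, not part of any alignment tool.

```python
def alignment_column(aligned_seq, pos):
    """Map a 0-based position in the ungapped sequence to its alignment column."""
    seen = -1
    for col, ch in enumerate(aligned_seq):
        if ch != "-":
            seen += 1
            if seen == pos:
                return col
    raise ValueError("position lies beyond the end of the sequence")


def window_around(aligned_a, aligned_b, pos_in_a, flank=5):
    """Return both aligned rows for the columns around pos_in_a (ungapped, 0-based)."""
    col = alignment_column(aligned_a, pos_in_a)
    start = max(0, col - flank)
    end = col + flank + 1
    return aligned_a[start:end], aligned_b[start:end]


# Toy pairwise alignment (assumption): rows must have equal length
human = "ACGT--ACGTTAGC"
mouse = "ACGTTTAC-TTGGC"

snp_pos = 6  # hypothetical 0-based SNP position in the ungapped human sequence
human_win, mouse_win = window_around(human, mouse, snp_pos, flank=2)
print(human_win)
print(mouse_win)
```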

NOTE: I'm assuming you're interested in a specific gene or coding region. If you're looking at an intergenic region, you can follow the same procedures, except that the retrieval process may be a bit more challenging.

Good luck!
