How to get orthologous sequence
6.2 years ago
L. A. Liggett ▴ 130

Is there a nice way to get the orthologous sequence for a particular gene locus? As an example of what I'm trying to do:

Say I have the gene CTNNB1 that is commonly mutated in humans at chr3:41266124. Now, I would like to convert this location to the mouse genome and get the surrounding gDNA sequence.

In the UCSC Genome Browser, the mouse conservation track shows the amino acid sequence but not the DNA sequence. This is the only place I can think of to find the information I need.

genome gene • 1.6k views

Why don't you look up the ortholog ID in the mouse genome, and then extract your sequence based on its location?


This is what I don't know how to do. I can see how the mouse gene can be accessed, but then how to target a particular hotspot within that region and identify its coordinates?

6.2 years ago

Perhaps a crude approach, but I think it might work:

Since you have both the human gene and its homologous mouse gene, first look up the specific position you mention (chr3:41266124) in the human gene and mark it in the sequence itself (with Ns, Xs, ...). Then align the marked human gene with the mouse homolog (ortholog?). Finally, 'lift over' the marked position from the human gene to the mouse gene based on the alignment of the two genes.
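The last step, carrying a marked position across the alignment, can be sketched in a few lines of Python. This is only an illustration: the function name `map_through_alignment` and the two toy sequences are made up for the example, and it assumes you already have the two gapped rows of a pairwise alignment from whatever aligner you prefer.

```python
def map_through_alignment(aln_a, aln_b, pos_a):
    """Map a 0-based ungapped index in sequence A to the matching index in B.

    aln_a and aln_b are the two rows of a pairwise alignment (same length,
    '-' marks a gap). Returns None if the position in A aligns to a gap in B.
    """
    if len(aln_a) != len(aln_b):
        raise ValueError("alignment rows must have equal length")
    i_a = i_b = -1  # ungapped indices of the last residue seen in each row
    for ch_a, ch_b in zip(aln_a, aln_b):
        if ch_a != "-":
            i_a += 1
        if ch_b != "-":
            i_b += 1
        if ch_a != "-" and i_a == pos_a:
            return i_b if ch_b != "-" else None
    raise IndexError("pos_a is beyond the end of sequence A")


# Toy alignment rows (hypothetical sequences, not real CTNNB1 data).
human_row = "ACGT-ACGTTA"
mouse_row = "AC-TTACG-TA"

print(map_through_alignment(human_row, mouse_row, 3))  # aligned base in mouse
print(map_through_alignment(human_row, mouse_row, 2))  # falls in a mouse gap
```

To get back to genomic coordinates you would add the returned index to the start of the mouse region you aligned (or subtract it from the end if the region is on the minus strand).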


This actually works pretty well. I grabbed a targeted region of sequence from UCSC with something like this:

cmd = "wget -O - 'http://genome.ucsc.edu/cgi-bin/das/hg38/dna?segment=%s:%s,%s' >> locs\n" % (chrom, low, high)

Then I used this in UCSC BLAT to find the orthologous region within mouse. (Though I'm not sure how to interact with BLAT programmatically.)
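For the sequence-fetching half, here is a rough Python sketch that swaps the DAS URL for the UCSC REST API `getData/sequence` endpoint, which takes a 0-based start and an exclusive end and returns JSON with a `dna` field. The helper names `ucsc_sequence_url` and `fetch_sequence` and the 50 bp flank are assumptions for illustration. The `genome` argument should be whichever assembly the coordinate was actually reported in; `hg38` below just mirrors the command above.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# UCSC REST API endpoint for raw sequence (not the DAS URL used above).
UCSC_API = "https://api.genome.ucsc.edu/getData/sequence"


def ucsc_sequence_url(genome, chrom, pos, flank):
    """Build a request URL for the bases within `flank` bp of a 1-based position."""
    start = max(pos - 1 - flank, 0)  # API start is 0-based
    end = pos + flank                # API end is exclusive
    query = urlencode({"genome": genome, "chrom": chrom, "start": start, "end": end})
    return f"{UCSC_API}?{query}"


def fetch_sequence(url):
    """Download the JSON response and return its 'dna' string (needs network)."""
    with urlopen(url) as resp:
        return json.load(resp)["dna"]


url = ucsc_sequence_url("hg38", "chr3", 41266124, 50)
print(url)
# seq = fetch_sequence(url)  # uncomment to actually download the sequence
```

The returned string can then be pasted into BLAT against the mouse assembly, as described above.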

6.2 years ago
GenoMax 148k
HomoloGene ID: 1434

Mouse
  Common Organism Name: mouse, laboratory
  NCBI Taxon ID: 10090
  Symbol: Ctnnb1
  EntrezGene ID: 12387
  Mouse MGI ID: MGI:88276
  Genetic Location: Chr9 72.19 cM
  Genomic Coordinates (mouse: , human: ): Chr9:120929216-120960507(+)
  Nucleotide RefSeq IDs: NM_001165902,NM_007614
  Protein RefSeq IDs: NP_031640,NP_001159374
  SWISS_PROT IDs: Q02248

Human
  Common Organism Name: human
  NCBI Taxon ID: 9606
  Symbol: CTNNB1
  EntrezGene ID: 1499
  HGNC ID: HGNC:2514
  OMIM Gene ID: OMIM:116806
  Genetic Location: Chr3 p22.1
  Genomic Coordinates (mouse: , human: ): Chr3:41199451-41240448(+)
  Nucleotide RefSeq IDs: NM_001098209,NM_001098210,NM_001330729,NM_001904
  Protein RefSeq IDs: NP_001091679,NP_001317658,XP_006713048,XP_024309125,NP_001895,XP_024309128,XP_006713046,NP_001091680,XP_016861227,XP_024309126,XP_024309124,XP_024309127
  SWISS_PROT IDs: P35222

List of human/mouse homologs from Jax Informatics.


This is an interesting resource, but I'm still not quite sure how this would help me target a particular sequence. The reference you point to seems to do a nice job of identifying the entire gene region, but how would I target a particular hotspot within the given gene?


You now have the location of the homologous gene in mouse. You need to do some work to figure out how the mutation in the human gene maps to the corresponding location in the mouse gene.
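A reasonable first piece of that work is a sanity check that the hotspot actually lies inside the human interval listed above. The sketch below parses the coordinate string and computes the offset from the gene's 5' end. `parse_interval` and `offset_in_gene` are hypothetical helper names. Note that this offset cannot simply be added to the mouse start, because intron lengths differ between species; it only tells you where to look before aligning.

```python
import re


def parse_interval(text):
    """Parse a string like 'Chr3:41199451-41240448(+)' into (chrom, start, end, strand)."""
    m = re.fullmatch(r"(?i)chr(\w+):(\d+)-(\d+)\(([+-])\)", text.strip())
    if m is None:
        raise ValueError(f"unrecognised interval: {text!r}")
    return "chr" + m.group(1), int(m.group(2)), int(m.group(3)), m.group(4)


def offset_in_gene(interval, chrom, pos):
    """Return the distance of pos from the gene's 5' end, or None if pos is outside."""
    g_chrom, start, end, strand = interval
    if chrom != g_chrom or not (start <= pos <= end):
        return None
    return pos - start if strand == "+" else end - pos


human = parse_interval("Chr3:41199451-41240448(+)")
print(offset_in_gene(human, "chr3", 41266124))
```

With the numbers exactly as given in this thread, the check returns None: chr3:41266124 lies beyond the end of the listed human interval. One possible explanation, which is an assumption worth verifying, is that the position in the question and the coordinates in the table come from different genome assemblies.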
