Distinguishing coding DNA using NCBI and UCSC databases
7.0 years ago
pairwiseseq ▴ 10

When I search for the word titin in the UCSC Genome Browser, I get multiple relevant hits. My interest is in defining the coding DNA region of the titin gene. Specifically, when I click on one of my search hits

TTN (uc021vtb.4) at chr2:178525989-178807423 - Homo sapiens titin (TTN), transcript variant N2-B, mRNA. (from RefSeq NM_003319)

and view the DNA, it is positioned at chr2:178,525,989-178,807,423. My guess is that this is the DNA coding strand.

My confusion comes when I look up the same titin gene in NCBI RefSeq. There the location is given as chr2:178525989..178807423, complement. And when I look at the NCBI FASTA sequence, it is the reverse complement of the one from the UCSC Genome Browser. So which is the correct coding DNA?

sequence gene • 1.4k views
7.0 years ago
genecats.ucsc ▴ 580

We always store our coordinates in relation to the positive strand, because it makes coordinate math easier to deal with. When you click on the transcript in question, you can see the following information:

Position: hg38 chr2:178,525,989-178,807,423
Size: 281,435
Total Exon Count: 191
Strand: -

The position here indicates only the genomic range of the item, and has nothing to do with the strandedness of the transcript.
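As a rough illustration of that point, the sketch below keeps the range and the strand as separate fields. The `GenomicInterval` class and its property names are made up for this example and are not part of any UCSC tool; the coordinates are the ones shown on the details page, treated as 1-based and inclusive.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GenomicInterval:
    """Hypothetical container: a range on the positive strand plus a strand flag."""

    chrom: str
    start: int   # 1-based, inclusive, counted along the positive strand
    end: int     # 1-based, inclusive, counted along the positive strand
    strand: str  # "+" or "-"

    @property
    def size(self) -> int:
        # Inclusive range, so add 1.
        return self.end - self.start + 1

    @property
    def five_prime(self) -> int:
        # The transcript's 5' end is the leftmost base on "+",
        # and the rightmost base on "-".
        return self.start if self.strand == "+" else self.end


ttn = GenomicInterval("chr2", 178_525_989, 178_807_423, "-")
print(ttn.size)        # matches the "Size" field on the details page
print(ttn.five_prime)  # rightmost coordinate, because the strand is "-"
```

The range itself is identical whichever strand the transcript is on; only the strand field tells you which end the transcript starts from.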

When you choose the DNA output link from the details page for your transcript, the DNA returned is reverse complemented. You can verify this by taking the first set of capitalized bases (the first exon) and BLATting the sequence against hg38, where you can see that the first exon is at the "rightmost" end of the transcript:
http://genome.ucsc.edu/cgi-bin/hgTracks?hgS_doOtherUser=submit&hgS_otherUserName=chmalee&hgS_otherUserSessionName=hg38_ttn_first_exon
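To check the relationship between the two sequences yourself, a minimal reverse-complement sketch in plain Python is below. The function name and the short test sequence are invented for illustration; in practice you would paste in the sequence over the same range from each site. Lowercase and uppercase are preserved, since the DNA output uses case to mark exons.

```python
# Translation table for the four bases plus N, preserving case.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


# Made-up positive-strand fragment, not real TTN sequence.
plus_strand = "AAGTcctgaCAT"
minus_strand = reverse_complement(plus_strand)
print(minus_strand)

# Applying the operation twice returns the original sequence.
print(reverse_complement(minus_strand) == plus_strand)
```

If the sequence from one source equals `reverse_complement()` of the other over the same range, the two are showing opposite strands of the same locus.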

For more information on our coordinate system, please see the following blog post:
http://genome.ucsc.edu/blog/the-ucsc-genome-browser-coordinate-counting-systems/
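As a small sketch of the two conventions that post describes, the helpers below convert between the 1-based, fully closed positions shown in the browser and the 0-based, half-open positions used in formats such as BED. The function names are invented for this example.

```python
def position_to_bed(start: int, end: int) -> tuple[int, int]:
    """1-based fully closed (browser position) -> 0-based half-open (BED)."""
    return start - 1, end


def bed_to_position(chrom_start: int, chrom_end: int) -> tuple[int, int]:
    """0-based half-open (BED) -> 1-based fully closed (browser position)."""
    return chrom_start + 1, chrom_end


# The TTN transcript position from the details page.
bed_start, bed_end = position_to_bed(178_525_989, 178_807_423)
print(bed_start, bed_end)

# In the half-open form, the size is simply end minus start.
print(bed_end - bed_start)
```

Neither conversion involves the strand; both forms count along the positive strand.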

Lastly, when you share links to the Genome Browser, do not include the "hgsid" in the URL. The hgsid identifies session-specific information on our servers, so if anyone who uses that hgsid browses to a new location, adds custom tracks, etc., the link you sent will no longer show what you intended. If you want to create a shareable link, please see our Session documentation:
http://genome.ucsc.edu/goldenPath/help/hgSessionHelp.html

If you have further questions about UCSC data or tools, feel free to send them to one of the mailing lists below:

  • General questions: genome@soe.ucsc.edu
  • Questions involving private data: genome-www@soe.ucsc.edu
  • Questions involving mirror sites: genome-mirror@soe.ucsc.edu

ChrisL from the UCSC Genome Browser
