Question

Refseq Gene Entry With Multiple Strand Information [Refgene Table-Mm10]

12.9 years ago

Sukhi Singh 11k

Hi, can someone solve this RefSeq puzzle for me?

For the gene 0610010B08Rik, the UCSC browser says:

RefSeq Gene 0610010B08Rik

RefSeq: NM_001177543.1 Status: Validated

Description: Mus musculus RIKEN cDNA 0610010B08 gene (0610010B08Rik), mRNA.

CCDS: CCDS50826.1

Entrez Gene: 100039060

PubMed on Gene: 0610010B08Rik

PubMed on Product: KRAB box and zinc finger C2H2 type domain containing

Stanford SOURCE: NM_001177543

mRNA/Genomic Alignments

The alignment you clicked on is first in the table below.

BROWSER | SIZE IDENTITY CHROMOSOME  STRAND    START     END              QUERY      START  END  TOTAL
-----------------------------------------------------------------------------------------------------
browser |  4539  100.0%          2     - 175192005 175338212          NM_001177543     1  4539  4539
browser |  4538  100.0%          2     - 175419391 175435777          NM_001177543     1  4539  4539
browser |  4538  100.0%          2     + 175640391 175656769          NM_001177543     1  4539  4539
browser |  4538  100.0%          2     + 175737942 175754328          NM_001177543     1  4539  4539
browser |  4538  100.0%          2     - 176470369 176486749          NM_001177543     1  4539  4539
browser |  4538  100.0%          2     - 176619933 176636319          NM_001177543     1  4539  4539

This alignment information is encoded in the refGene table (mm10) when you pull it from UCSC, which means that for the gene 0610010B08Rik there are 6 entries with the same NM ID (the same RNA). I always collapse multiple entries to the single longest entry, but in this case the entries for the same gene are on different strands. How is this possible?
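To make the problem concrete, here is a minimal Python sketch of the collapse I describe, run on the six rows from the alignment table above. The `Alignment` tuple and the function names are made up for illustration and are not part of any UCSC tool; "chr2" assumes the usual UCSC naming for chromosome 2. It flags accessions placed on more than one strand and then picks the longest placement per accession.

```python
from collections import defaultdict
from typing import NamedTuple


class Alignment(NamedTuple):
    """One genomic placement of a RefSeq RNA (hypothetical helper type)."""
    name: str    # RefSeq accession
    chrom: str
    strand: str  # "+" or "-"
    start: int
    end: int


# Rows copied from the alignment table in the question.
ROWS = [
    Alignment("NM_001177543", "chr2", "-", 175192005, 175338212),
    Alignment("NM_001177543", "chr2", "-", 175419391, 175435777),
    Alignment("NM_001177543", "chr2", "+", 175640391, 175656769),
    Alignment("NM_001177543", "chr2", "+", 175737942, 175754328),
    Alignment("NM_001177543", "chr2", "-", 176470369, 176486749),
    Alignment("NM_001177543", "chr2", "-", 176619933, 176636319),
]


def group_by_accession(rows):
    """Group placements by RefSeq accession."""
    groups = defaultdict(list)
    for row in rows:
        groups[row.name].append(row)
    return groups


def mixed_strand_accessions(rows):
    """Map each accession placed on more than one strand to its strand set."""
    mixed = {}
    for name, group in group_by_accession(rows).items():
        strands = {r.strand for r in group}
        if len(strands) > 1:
            mixed[name] = strands
    return mixed


def longest_per_accession(rows):
    """Collapse each accession to its longest genomic span (end - start)."""
    return {
        name: max(group, key=lambda r: r.end - r.start)
        for name, group in group_by_accession(rows).items()
    }


if __name__ == "__main__":
    print(mixed_strand_accessions(ROWS))
    print(longest_per_accession(ROWS))
```

With these rows the longest-entry rule selects the first placement (175192005-175338212), which spans far more genomic sequence than the other five, so I am not sure it is the right representative.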

The RefSeq method page says:

RefSeq RNAs were aligned against the mouse genome using blat; those with an alignment of less than 15% were discarded. When a single RNA aligned in multiple places, the alignment having the highest base identity was identified. Only alignments having a base identity level within 0.1% of the best and at least 96% base identity with the genomic sequence were kept.
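As I read it, the quoted rule is a two-part filter on base identity, which would explain why all six placements survive: they all report 100.0%. A rough Python sketch of that reading follows. The function name and the interpretation of "within 0.1%" as 0.1 percentage points are my assumptions, not something the method page spells out.

```python
def keep_near_best(identities, window=0.1, floor=96.0):
    """Keep identities (in percent) that are within `window` percentage
    points of the best alignment and at least `floor` percent.

    This is one reading of the quoted rule, not UCSC's actual code.
    """
    if not identities:
        return []
    best = max(identities)
    return [i for i in identities if i >= best - window and i >= floor]


if __name__ == "__main__":
    # All six placements in the table above report 100.0% identity,
    # so under this reading none of them is filtered out.
    print(keep_near_best([100.0] * 6))
```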

Second, for my unique list, which entry should I take?

Thanks

refseq ucsc chip-seq gene • 3.4k views
