Transcript IDs are not unique in the hg19 RefSeq gene annotation file. I found many IDs that occur more than once, with different start and end positions on the same chromosome and strand. Does this mean the gene is duplicated? Do duplicated genes share the same ID/name?
As far as I can see, some refGene transcripts have also been mapped to the "alternative haplotype" chromosomes. See http://genome.ucsc.edu/FAQ/FAQdownloads#download10
$ mysql --user=genome --host=genome-mysql.cse.ucsc.edu -A -D hg19 -e '
select * from refGene where name="NM_000593" limit 10\G'
*************************** 1. row ***************************
bin: 835
name: NM_000593
chrom: chr6
strand: -
txStart: 32812985
txEnd: 32821748
cdsStart: 32813355
cdsEnd: 32821593
exonCount: 11
exonStarts: 32812985,32814844,32815289,32815695,32816428,32816766,32818096,32818720,32819885,32820164,32820815,
exonEnds: 32813562,32814981,32815452,32815869,32816617,32816895,32818294,32818926,32820016,32820279,32821748,
score: 0
name2: TAP1
cdsStartStat: cmpl
cdsEndStat: cmpl
exonFrames: 0,1,0,0,0,0,0,1,2,1,0,
*************************** 2. row ***************************
bin: 616
name: NM_000593
chrom: chr6_apd_hap1
strand: -
txStart: 4099989
txEnd: 4108752
cdsStart: 4100359
cdsEnd: 4108597
exonCount: 11
exonStarts: 4099989,4101848,4102293,4102699,4103432,4103770,4105098,4105722,4106889,4107168,4107819,
exonEnds: 4100566,4101985,4102456,4102873,4103621,4103899,4105296,4105928,4107020,4107283,4108752,
score: 0
name2: TAP1
cdsStartStat: cmpl
cdsEndStat: cmpl
exonFrames: 0,1,0,0,0,0,0,1,2,1,0,
*************************** 3. row ***************************
bin: 617
name: NM_000593
chrom: chr6_cox_hap2
strand: -
txStart: 4257513
txEnd: 4266276
cdsStart: 4257883
cdsEnd: 4266121
exonCount: 11
exonStarts: 4257513,4259372,4259817,4260223,4260956,4261294,4262624,4263248,4264413,4264692,4265343,
exonEnds: 4258090,4259509,4259980,4260397,4261145,4261423,4262822,4263454,4264544,4264807,4266276,
score: 0
name2: TAP1
cdsStartStat: cmpl
cdsEndStat: cmpl
exonFrames: 0,1,0,0,0,0,0,1,2,1,0,
*************************** 4. row ***************************
bin: 616
name: NM_000593
chrom: chr6_dbb_hap3
strand: -
txStart: 4094363
txEnd: 4103126
cdsStart: 4094733
cdsEnd: 4102971
exonCount: 11
exonStarts: 4094363,4096222,4096667,4097073,4097806,4098144,4099472,4100096,4101263,4101542,4102193,
exonEnds: 4094940,4096359,4096830,4097247,4097995,4098273,4099670,4100302,4101394,4101657,4103126,
score: 0
name2: TAP1
cdsStartStat: cmpl
cdsEndStat: cmpl
exonFrames: 0,1,0,0,0,0,0,1,2,1,0,
*************************** 5. row ***************************
bin: 617
name: NM_000593
chrom: chr6_ssto_hap7
strand: -
txStart: 4243758
txEnd: 4252521
cdsStart: 4244128
cdsEnd: 4252366
exonCount: 11
exonStarts: 4243758,4245617,4246062,4246468,4247201,4247539,4248867,4249491,4250658,4250937,4251588,
exonEnds: 4244335,4245754,4246225,4246642,4247390,4247668,4249065,4249697,4250789,4251052,4252521,
score: 0
name2: TAP1
cdsStartStat: cmpl
cdsEndStat: cmpl
exonFrames: 0,1,0,0,0,0,0,1,2,1,0,
*************************** 6. row ***************************
bin: 617
name: NM_000593
chrom: chr6_mann_hap4
strand: -
txStart: 4270181
txEnd: 4278944
cdsStart: 4270551
cdsEnd: 4278789
exonCount: 11
exonStarts: 4270181,4272040,4272485,4272891,4273624,4273962,4275292,4275916,4277081,4277360,4278011,
exonEnds: 4270758,4272177,4272648,4273065,4273813,4274091,4275490,4276122,4277212,4277475,4278944,
score: 0
name2: TAP1
cdsStartStat: cmpl
cdsEndStat: cmpl
exonFrames: 0,1,0,0,0,0,0,1,2,1,0,
*************************** 7. row ***************************
bin: 616
name: NM_000593
chrom: chr6_mcf_hap5
strand: -
txStart: 4149862
txEnd: 4158625
cdsStart: 4150232
cdsEnd: 4158470
exonCount: 11
exonStarts: 4149862,4151721,4152166,4152572,4153305,4153643,4154971,4155595,4156762,4157041,4157692,
exonEnds: 4150439,4151858,4152329,4152746,4153494,4153772,4155169,4155801,4156893,4157156,4158625,
score: 0
name2: TAP1
cdsStartStat: cmpl
cdsEndStat: cmpl
exonFrames: 0,1,0,0,0,0,0,1,2,1,0,
*************************** 8. row ***************************
bin: 615
name: NM_000593
chrom: chr6_qbl_hap6
strand: -
txStart: 4045092
txEnd: 4053855
cdsStart: 4045462
cdsEnd: 4053700
exonCount: 11
exonStarts: 4045092,4046951,4047396,4047802,4048535,4048873,4050203,4050827,4051992,4052271,4052922,
exonEnds: 4045669,4047088,4047559,4047976,4048724,4049002,4050401,4051033,4052123,4052386,4053855,
score: 0
name2: TAP1
cdsStartStat: cmpl
cdsEndStat: cmpl
exonFrames: 0,1,0,0,0,0,0,1,2,1,0,
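To check this for a whole file rather than one transcript, something like the following Python sketch could group the placements of each transcript ID and separate the primary chromosome from the alternative haplotypes. It assumes tab-separated refGene rows with the column order shown in the query output above (bin, name, chrom, strand, txStart, txEnd, ...), and it assumes haplotype sequences can be recognised by "_hap" in the chromosome name, as in chr6_cox_hap2. The function names are hypothetical, not part of any UCSC tool.

```python
from collections import defaultdict


def group_by_transcript(lines):
    """Map each transcript ID to its list of (chrom, strand, txStart, txEnd)."""
    placements = defaultdict(list)
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 6:
            continue  # skip blank or malformed lines
        name, chrom, strand = fields[1], fields[2], fields[3]
        placements[name].append((chrom, strand, int(fields[4]), int(fields[5])))
    return placements


def is_alt_haplotype(chrom):
    # Assumption: hg19 alternative haplotypes are named like chr6_apd_hap1.
    return "_hap" in chrom


def split_placements(records):
    """Split one transcript's placements into (primary, alternative haplotype)."""
    primary = [r for r in records if not is_alt_haplotype(r[0])]
    alt = [r for r in records if is_alt_haplotype(r[0])]
    return primary, alt


# Two rows from the output above, cut down to the first six columns.
sample = [
    "835\tNM_000593\tchr6\t-\t32812985\t32821748",
    "616\tNM_000593\tchr6_apd_hap1\t-\t4099989\t4108752",
]
primary, alt = split_placements(group_by_transcript(sample)["NM_000593"])
print(len(primary), "primary placement(s),", len(alt), "on alternative haplotypes")
```

A transcript whose extra rows all fall in the second list is probably not a duplicated gene, just the same gene annotated on each haplotype sequence.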
Edit: your NM_012151 has been mapped to multiple locations on chrX because its position is ambiguous: chrX is full of segmental duplications.
I have posted examples; I don't know whether you are able to see them.
1760 NM_012151 chrX + 154114634 154116336 154114649 154115765 1 154114634, 154116336, 0 F8A1 cmpl cmpl 0,
1764 NM_012151 chrX + 154611748 154613450 154611763 154612879 1 154611748, 154613450, 0 F8A1 cmpl cmpl 0,
1765 NM_012151 chrX - 154686574 154688276 154687145 154688261 1 154686574, 154688276, 0 F8A1 cmpl cmp
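Since the transcript ID alone is not a unique key, one possible workaround is to combine it with the genomic position. The Python sketch below uses only the name, chromosome, strand, txStart and txEnd values from the three rows above; the function names and the ID format are hypothetical choices for illustration.

```python
from collections import Counter

# (name, chrom, strand, txStart, txEnd) copied from the three rows above.
rows = [
    ("NM_012151", "chrX", "+", 154114634, 154116336),
    ("NM_012151", "chrX", "+", 154611748, 154613450),
    ("NM_012151", "chrX", "-", 154686574, 154688276),
]


def multi_locus_transcripts(rows):
    """Return {(name, chrom): count} for IDs placed more than once on a chromosome."""
    counts = Counter((name, chrom) for name, chrom, *_ in rows)
    return {key: n for key, n in counts.items() if n > 1}


def unique_id(name, chrom, strand, tx_start, tx_end):
    # Hypothetical format: transcript ID plus its placement.
    return f"{name}|{chrom}:{tx_start}-{tx_end}({strand})"


ids = [unique_id(*row) for row in rows]
print(multi_locus_transcripts(rows))
for identifier in ids:
    print(identifier)
```

Each placement then gets its own identifier, while the shared prefix still shows that the rows belong to the same RefSeq transcript.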
Can you give an example?