7 months ago
curious
I am looking at this file:
I see there are transcript variant numbers in the headers, e.g.:
>NM_000014.6 Homo sapiens alpha-2-macroglobulin (A2M), **transcript variant 1**, mRNA
What do these correspond to? It does not seem to be the longest transcript. Is it the canonical transcript?
I spot checked BRCA1 and BRCA2
>NM_007294.4 Homo sapiens BRCA1 DNA repair associated (BRCA1), transcript variant 1, mRNA
>NM_000059.4 Homo sapiens BRCA2 DNA repair associated (BRCA2), transcript variant 1, mRNA
The NM accessions seem to match the Ensembl canonical transcripts:
https://useast.ensembl.org/Homo_sapiens/Gene/Summary?g=ENSG00000139618;r=13:32315086-32400268
https://useast.ensembl.org/Homo_sapiens/Gene/Summary?g=ENSG00000012048;r=17:43044295-43170245
Those are just different transcripts annotated in RefSeq for that particular gene.
Here is the example for the gene you posted: https://www.ncbi.nlm.nih.gov/datasets/gene/id/2/products/
Click on the "Select Column" button and add "transcript name" to the table to see the complete names.
If you are interested in one transcript per gene, then take a look at the MANE project: https://www.ncbi.nlm.nih.gov/refseq/MANE/
Longest == canonical. If you're looking for the transcript deemed to be most relevant by researchers, look at MANE as suggested by GenoMax. MANE Select is an even better annotation for these transcripts.
With genes like BRCA1 and ESR1, you'll find that the MANE transcript does not match the canonical transcript, and the common mutations you observe will match those annotated on the MANE transcripts.
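If it helps to see which variants are listed per gene before comparing them against MANE, here is a minimal sketch that pulls the accession, gene symbol, and variant label out of headers shaped like the ones quoted in the question. The regular expression and the function name `parse_refseq_header` are my own assumptions based only on those three example headers, not an official RefSeq header grammar, so other header styles may not parse.

```python
import re

# Assumed header layout, inferred from the examples in the question:
# >ACCESSION.VERSION Description (SYMBOL), transcript variant N, MOLECULE
HEADER_RE = re.compile(
    r"^>(?P<accession>[A-Z]{2}_\d+\.\d+)\s+"
    r"(?P<description>.*?)\s+\((?P<symbol>[^()]+)\),\s+"
    r"(?:transcript variant (?P<variant>[^,]+),\s+)?"
    r"(?P<molecule>.+)$"
)


def parse_refseq_header(line):
    """Return a dict of header fields, or None if the line does not fit the assumed layout."""
    match = HEADER_RE.match(line.strip())
    if match is None:
        return None
    return match.groupdict()


if __name__ == "__main__":
    headers = [
        ">NM_000014.6 Homo sapiens alpha-2-macroglobulin (A2M), transcript variant 1, mRNA",
        ">NM_007294.4 Homo sapiens BRCA1 DNA repair associated (BRCA1), transcript variant 1, mRNA",
        ">NM_000059.4 Homo sapiens BRCA2 DNA repair associated (BRCA2), transcript variant 1, mRNA",
    ]
    for header in headers:
        fields = parse_refseq_header(header)
        print(fields["symbol"], fields["accession"], "variant", fields["variant"])
```

The variant label is kept as a string because it is only a name for one of the gene's annotated transcripts; it does not by itself say which transcript is canonical or MANE Select.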