Question

HUGO and Ensembl ids

6.8 years ago

rsafavi ▴ 60

If we have genes with the same HUGO ID but different Ensembl IDs, does it make sense to add up their raw counts (for RNA expression or single-cell analysis)? Does it make sense to treat them as isoforms?

RNA-Seq single cell hugo Ensembl • 2.7k views

ADD COMMENT • link updated 6.7 years ago by EagleEye 7.6k • written 6.8 years ago by rsafavi ▴ 60

Please provide some examples.

ADD REPLY • link 6.8 years ago by EagleEye 7.6k

Maybe I can give an example. I have human RNA-seq samples, and the differential expression results for the gene NDUFA6 look like this:

ENS ID             Gene Name    logFC
ENSG00000272765    NDUFA6       -0.6
ENSG00000281013    NDUFA6        0.8
ENSG00000184983    NDUFA6       -0.6

As you can see, there are different ENS IDs for the same gene name (two alternative sequence alignments, and the last one is the reference gene on the Ensembl website). Usually I do not get such different fold changes for alternative versions of the same gene, but here it gets tricky. Should I merge all alignments of the same gene name into a single gene expression value (for all such cases) and rerun the DE analysis? Or how should I interpret these results?

ADD REPLY • link updated 6.7 years ago by finswimmer 16k • written 6.7 years ago by turkulerc • 0

score 3 · Answer 1 · 2018-11-13

Hi,

Only the last one is from the chromosome 22 assembly. The first two are from exception contigs (haplotype variant contigs), so there is a reason they are assigned different 'ENSG' IDs. I would say always use 'ENSG' IDs for reference/indexing purposes when you are working with Ensembl annotation. In my opinion, for gene-level analysis you should always summarize by 'ENSG' ID; for transcript-level/isoform-level analysis, always summarize by 'ENST' ID.
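To make the gene-level suggestion concrete, here is a minimal sketch in Python, assuming a small transcript-level table. The ENSG IDs and the gene symbol are the ones from the example in this thread; the transcript labels (tx1 to tx4) and all counts are made-up placeholders, and the function names are only illustrative. It sums counts per 'ENSG' ID rather than per gene symbol, and separately lists symbols that map to more than one 'ENSG' ID so they can be inspected instead of merged silently.

```python
from collections import defaultdict

# Hypothetical rows: (transcript label, Ensembl gene ID, gene symbol, raw count).
# The ENSG IDs and symbol come from the thread; tx1..tx4 and the counts are invented.
rows = [
    ("tx1", "ENSG00000184983", "NDUFA6", 300),
    ("tx2", "ENSG00000184983", "NDUFA6", 200),
    ("tx3", "ENSG00000272765", "NDUFA6", 120),
    ("tx4", "ENSG00000281013", "NDUFA6", 15),
]

def summarize_by_gene_id(rows):
    """Sum raw counts per ENSG ID (gene-level summary keyed on the Ensembl ID)."""
    totals = defaultdict(int)
    for _tx, gene_id, _symbol, count in rows:
        totals[gene_id] += count
    return dict(totals)

def symbols_with_multiple_ids(rows):
    """Return gene symbols that are attached to more than one ENSG ID."""
    ids_per_symbol = defaultdict(set)
    for _tx, gene_id, symbol, _count in rows:
        ids_per_symbol[symbol].add(gene_id)
    return {s: sorted(ids) for s, ids in ids_per_symbol.items() if len(ids) > 1}

gene_totals = summarize_by_gene_id(rows)
ambiguous = symbols_with_multiple_ids(rows)
```

With this layout the three NDUFA6 entries stay as three separate rows in `gene_totals`, and `ambiguous` reports that the symbol is shared, which is the situation described in the comment above.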

Note:

ENSG: genes; ENST: transcript variants/isoforms; ENSE: exons; ENSP: proteins
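If it helps, the prefix convention in the note can be checked programmatically. This is only a rough sketch: the helper names are hypothetical, it covers just the four human prefixes listed above, and it assumes that any version suffix is written after a dot (as in `ENSG00000184983.9`, where the `.9` is an invented version number used for illustration).

```python
# Mapping of the four prefixes from the note to the feature type they denote.
FEATURE_BY_PREFIX = {
    "ENSG": "gene",
    "ENST": "transcript",
    "ENSE": "exon",
    "ENSP": "protein",
}

def strip_version(stable_id):
    """Drop a trailing '.N' version suffix, if present."""
    return stable_id.split(".", 1)[0]

def feature_type(stable_id):
    """Classify an ID by its first four characters.

    Non-human IDs carry a species code inside the prefix (e.g. ENSMUSG for
    mouse genes); this sketch does not handle them and returns 'unknown'.
    """
    return FEATURE_BY_PREFIX.get(strip_version(stable_id)[:4], "unknown")
```

For example, `feature_type("ENSG00000184983")` gives `"gene"`, so a table can be checked for mixed gene-level and transcript-level IDs before summarizing.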