Output gene symbol instead of ENST ID in Kallisto

2.4 years ago

smurph50 ▴ 50

My goal is to get a counts matrix with gene symbols from bulk RNA seq data using Kallisto.

I ran kallisto and got an abundance.tsv whose target_id column contains Ensembl transcript IDs (e.g. ENST00000361624.2, ENST00000355349.4).

When I convert to gene symbol, I have to drop the isoform portions (ENST00000361624.2 --> ENST00000361624). This results in multiple rows with the same gene symbol.

Can I map directly to gene symbols instead of Ensembl transcript IDs?

kallisto seq align mapping rna bulk • 1.3k views

ADD COMMENT • link updated 2.4 years ago by rpolicastro 13k • written 2.4 years ago by smurph50 ▴ 50

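See tximport to correctly summarize transcript abundances to the gene level. Also, the number after the period in the Ensembl transcript ID is the transcript version, which keeps track of cases where the model of the transcript has changed in the reference genome. Each ENST ID is itself one isoform, so it is expected that several transcripts map to the same gene symbol; those rows need to be summarized per gene rather than treated as duplicates.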
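To make the idea concrete, here is a rough Python sketch of the simplest form of gene-level summarization: strip the version suffix, look up the gene for each transcript, and sum est_counts per gene. This is not tximport itself (tximport is an R package and handles things like length-based offsets that this ignores). The column names are the ones kallisto writes to abundance.tsv; the function names, the tx2gene dictionary, and the GENE_A/GENE_B symbols are placeholders made up for illustration, and in practice the mapping would come from your annotation (e.g. the GTF matching your index).

```python
import csv
import io
from collections import defaultdict


def strip_version(transcript_id):
    """Drop the trailing '.N' version suffix from an Ensembl ID."""
    return transcript_id.split(".", 1)[0]


def gene_level_counts(abundance_file, tx2gene):
    """Sum kallisto est_counts over all transcripts belonging to each gene.

    abundance_file: open text handle on a kallisto abundance.tsv
    tx2gene: dict mapping unversioned transcript ID -> gene symbol
             (hypothetical here; build it from your own annotation)
    """
    totals = defaultdict(float)
    reader = csv.DictReader(abundance_file, delimiter="\t")
    for row in reader:
        gene = tx2gene.get(strip_version(row["target_id"]))
        if gene is None:
            continue  # transcript missing from the mapping; skipped in this sketch
        totals[gene] += float(row["est_counts"])
    return dict(totals)


# Toy input in abundance.tsv layout; all numbers are invented.
demo = io.StringIO(
    "target_id\tlength\teff_length\test_counts\ttpm\n"
    "ENST00000361624.2\t1500\t1300.0\t10.0\t5.0\n"
    "ENST00000355349.4\t2000\t1800.0\t2.5\t1.0\n"
    "ENST00000000001.1\t900\t700.0\t4.0\t3.0\n"
)

# Placeholder mapping: the first two transcripts are assumed to share a gene.
tx2gene = {
    "ENST00000361624": "GENE_A",
    "ENST00000355349": "GENE_A",
    "ENST00000000001": "GENE_B",
}

print(gene_level_counts(demo, tx2gene))
```

The two transcripts assigned to GENE_A collapse into a single row, which is the behaviour you want instead of duplicate gene symbols. For a real analysis, especially anything feeding into differential expression, tximport is still the safer route.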

ADD REPLY • link 2.4 years ago by rpolicastro 13k
