Question

Length by gene from salmon

8 months ago

marco.barr ▴ 150

Hello everyone,

I have a question about the length values reported in the quant.sf file produced by salmon during RNA-seq analysis. I know that salmon quantifies reads at the transcript level and that tximport then summarizes these to gene IDs. However, when I open my quant file, the Name column contains all the IDs separated by |, including the gene IDs and gene names, for each isoform. In tximport, I correct this with ignoreAfterBar = TRUE. In the Length column, I have different length values associated with the same gene. How can I obtain a single length value per gene, given that the quant file is organized by transcript? I need this to normalize the raw count data. Is there a way? I hope this is clear. Thanks to everyone for the support.

salmon RNA-seq • 504 views

score 1 · Answer 1 · 2024-03-19

8 months ago

ATpoint 85k

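tximport returns a list with a length slot. That is the average transcript length per gene, weighted by the estimated abundance of each transcript in each sample, so it is a genes-by-samples matrix rather than a single column. Use that. The ignoreAfterBar option is only relevant for matching the transcript names to the tx2gene map. It does not 'correct' for anything.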
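
To make the idea of an abundance-weighted average length concrete, here is a minimal Python sketch for a single sample. It is not tximport's code, only an illustration of the arithmetic: the function name `gene_lengths`, the toy IDs, and the fallback to a plain mean when a gene has zero abundance are all assumptions made for this example. In practice you would take the length matrix that tximport returns instead of computing it yourself.

```python
from collections import defaultdict


def gene_lengths(rows, tx2gene):
    """Abundance-weighted mean transcript length per gene for one sample.

    rows    : iterable of (name, length, tpm) tuples, one per transcript,
              mimicking the Name, Length and TPM columns of a quant.sf file.
    tx2gene : dict mapping transcript ID -> gene ID (hypothetical toy map).
    """
    weighted_sum = defaultdict(float)  # sum of length * TPM per gene
    tpm_sum = defaultdict(float)       # sum of TPM per gene
    plain_sum = defaultdict(float)     # sum of lengths per gene (fallback)
    n_tx = defaultdict(int)            # number of transcripts per gene

    for name, length, tpm in rows:
        # Keep only the part before the first "|", which is the same idea
        # as ignoreAfterBar: it only serves to match the tx2gene keys.
        tx_id = name.split("|", 1)[0]
        gene = tx2gene[tx_id]
        weighted_sum[gene] += length * tpm
        tpm_sum[gene] += tpm
        plain_sum[gene] += length
        n_tx[gene] += 1

    lengths = {}
    for gene in n_tx:
        if tpm_sum[gene] > 0:
            lengths[gene] = weighted_sum[gene] / tpm_sum[gene]
        else:
            # Assumption of this sketch: use the unweighted mean when
            # no transcript of the gene is expressed in the sample.
            lengths[gene] = plain_sum[gene] / n_tx[gene]
    return lengths


# Toy data: two isoforms of one gene, plus one unexpressed gene.
rows = [
    ("ENST0001.1|ENSG0001.1|GENEA", 1000.0, 30.0),
    ("ENST0002.1|ENSG0001.1|GENEA", 2000.0, 10.0),
    ("ENST0003.1|ENSG0002.1|GENEB", 500.0, 0.0),
]
tx2gene = {
    "ENST0001.1": "ENSG0001.1",
    "ENST0002.1": "ENSG0001.1",
    "ENST0003.1": "ENSG0002.1",
}
lengths = gene_lengths(rows, tx2gene)
```

The more abundant isoform dominates the result, which is why the same gene can get a different length in each sample and why a single fixed length per gene is usually not what you want for normalization.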