Question

Galaxy tool for quantifying RNAseq data

7.2 years ago

kml ▴ 30

Hi all, I'm looking for a tool for quantifying my RNAseq data (Illumina paired-end). It should:

1. be available in the Galaxy interface (Melbourne server);
2. normalize the data to both library size and CDS length;
3. work with genome alignments (preferably HISAT2) rather than alignments to the transcriptome (the transcriptome that is available for my organism is problematic).

Does anyone have anything in mind? I would appreciate any suggestions. Thank you!

RNA-Seq quantification galaxy • 2.5k views

ADD COMMENT • link 7.2 years ago by kml ▴ 30

Thank you, Devon Ryan, for your answer. I ran the featureCounts tool, as you suggested, and now I have a count output for every sample and, in separate files, the lengths. Should I give DESeq2 the count matrix, or should I first calculate FPKM from the count data, library size, and lengths?

ADD REPLY • link 7.2 years ago by kml ▴ 30
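For readers at the same step, here is a minimal sketch of combining per-sample count tables into one count matrix. It assumes each per-sample output is a two-column, tab-separated table (gene ID, count) with a header line; that layout, the sample names, and the helper names `read_counts` and `build_count_matrix` are illustrative assumptions, not something confirmed in this thread.

```python
import csv
import io


def read_counts(handle):
    """Read a two-column (gene ID, count) tab-separated table.

    Assumes the first line is a header (hypothetical layout).
    """
    reader = csv.reader(handle, delimiter="\t")
    next(reader)  # skip the assumed header line
    return {gene: int(count) for gene, count in reader}


def build_count_matrix(samples):
    """Combine per-sample dicts into (gene IDs, sample names, rows of counts).

    Genes missing from a sample are given a count of 0.
    """
    names = list(samples)
    genes = sorted(set().union(*(samples[name].keys() for name in names)))
    rows = [[samples[name].get(gene, 0) for name in names] for gene in genes]
    return genes, names, rows


# Toy in-memory tables standing in for two per-sample files.
ctrl = read_counts(io.StringIO("Geneid\tctrl_1\ngeneA\t100\ngeneB\t50\n"))
trt = read_counts(io.StringIO("Geneid\ttrt_1\ngeneA\t400\ngeneB\t40\n"))

genes, names, rows = build_count_matrix({"ctrl_1": ctrl, "trt_1": trt})
for gene, row in zip(genes, rows):
    print(gene, row)
```

The resulting matrix holds raw, unnormalized counts, with one row per gene and one column per sample.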

DESeq2 takes raw counts. Do not normalize; DESeq2 will take care of that.

ADD REPLY • link 7.2 years ago by swbarnes2 15k
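To illustrate what library-size normalization of raw counts can look like, here is a rough sketch of the median-of-ratios idea described in the DESeq2 documentation. It is a simplified pure-Python illustration on toy numbers, not DESeq2's own code; the function name `size_factors` is hypothetical.

```python
import math
from statistics import median


def size_factors(counts):
    """Estimate one size factor per sample with a median-of-ratios scheme.

    counts: list of rows (one per gene), each a list of raw counts per sample.
    Genes with a zero count in any sample are skipped, because their
    geometric mean across samples would be zero.
    """
    n_samples = len(counts[0])
    kept = [row for row in counts if all(c > 0 for c in row)]
    # Log of the geometric mean of each kept gene across samples.
    log_geo_means = [sum(math.log(c) for c in row) / n_samples for row in kept]
    factors = []
    for j in range(n_samples):
        log_ratios = [math.log(row[j]) - lg for row, lg in zip(kept, log_geo_means)]
        factors.append(math.exp(median(log_ratios)))
    return factors


# Toy matrix: the second sample was sequenced twice as deeply as the first.
counts = [[100, 200], [50, 100], [30, 60]]
factors = size_factors(counts)
normalized = [[c / f for c, f in zip(row, factors)] for row in counts]
print(factors)
```

Dividing each sample's counts by its size factor puts the samples on a common scale, which is why the counts should go into DESeq2 unnormalized.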

I'm sorry, but I'm still confused. It is my understanding that DESeq2 normalizes only by library size, and I wish to normalize by gene length as well (because I want to be able to compare genes within a sample too). What should I do? Thank you!

ADD REPLY • link 7.2 years ago by kml ▴ 30

I used featureCounts to generate a count matrix, and from it and the lengths I calculated TPM values in Excel. The issue now is how to do statistics on the TPM values (replicates and adjusted p-values). I appreciate all your help.

ADD REPLY • link 7.2 years ago by kml ▴ 30
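For reference, the TPM calculation mentioned above can be sketched in a few lines. This uses the standard TPM definition (counts divided by length in kilobases, then rescaled so each sample sums to one million); the toy counts and lengths are made up for illustration, and the function name `tpm` is hypothetical.

```python
def tpm(counts, lengths_bp):
    """Convert one sample's raw counts to TPM.

    counts: raw read counts per gene for a single sample.
    lengths_bp: gene (or CDS) lengths in base pairs, in the same order.
    """
    # Reads per kilobase: corrects for gene length.
    rates = [c / (length / 1000) for c, length in zip(counts, lengths_bp)]
    total = sum(rates)
    # Rescale so the values for the sample sum to one million.
    return [r / total * 1e6 for r in rates]


# Toy sample: counts scale exactly with length, so all genes get equal TPM.
counts = [100, 200, 300]
lengths = [1000, 2000, 3000]
values = tpm(counts, lengths)
print(values)
```

TPM values like these are suited to comparing genes within one sample. For testing differences between conditions, the replies in this thread recommend giving the raw counts to DESeq2 instead.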

You don't do statistics with TPM values. You need to clearly explain what your end goal is, since comparing genes within samples is very unusual.

ADD REPLY • link 7.2 years ago by Devon Ryan 105k

I was not aware that it is unusual. I wish to be able to say, for example, that gene A was up-regulated, with a higher fold change, than gene B. So under treatment X gene A is more relevant. But in treatment Y, gene B is more up-regulated than gene A.

Why can't I do statistics after normalizing the counts and achieving the TPM values? I'm not sure I'm following. Thank you

ADD REPLY • link 7.1 years ago by kml ▴ 30

score 0 · Answer 1 · 2018-05-30

7.2 years ago

Devon Ryan 105k

1. featureCounts, which you can ask Simon (the admin of that instance) to install if it's not already there.
2. Use the output of 1 in DESeq2, which can also be trivially installed if not already there.
3. Not an issue.

ADD COMMENT • link 7.2 years ago by Devon Ryan 105k