Question

ESTIMATE (Estimation of STromal and Immune cells in MAlignant Tumor tissues using Expression data)

1

4.4 years ago

amirnavidinia2014 ▴ 10

Hi there. Can anyone explain how to use the ESTIMATE package for RNA-seq analysis? I want to calculate immune and stromal scores with the ESTIMATE algorithm, and then test the relationship between those scores and subtype classification and cytogenetic risk by one-way analysis of variance (ANOVA), but I don't know how to do this.

I will be grateful for any help you can provide.

RNA-Seq R ESTIMATE • 6.6k views

ADD COMMENT • link updated 10 weeks ago by F • 0 • written 4.4 years ago by amirnavidinia2014 ▴ 10

0

Can you please elaborate on what you have already tried and on which part (or parts) you are having trouble? Thank you.

ADD REPLY • link 4.4 years ago by Kevin Blighe 88k

0

How can I use this package? Should I normalize the data before using it? And finally, how can I read the .gct output file?

ADD REPLY • link 4.4 years ago by amirnavidinia2014 ▴ 10

0

How do I prepare the .gct file for use in DESeq2?

ADD REPLY • link 4.4 years ago by amirnavidinia2014 ▴ 10

score 6 · Accepted Answer · 2020-08-16

6

4.4 years ago

Kevin Blighe 88k

A vignette PDF comes installed with the package, and should be located at:

R/x86_64-pc-linux-gnu-library/4.0/estimate/doc/ESTIMATE_Vignette.pdf

In this vignette, they use some data that comes bundled with the package (R/x86_64-pc-linux-gnu-library/4.0/estimate/extdata/sample_input.txt), which represents Affymetrix U133 microarray data that appears to be normalised and log2-transformed.

So, if you have RNA-seq data, I would normalise it in the usual way and then transform it via rlog or vst (both from DESeq2). Then pass the rlog or vst expression levels to ESTIMATE.
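A minimal sketch of that workflow, assuming the estimate package is installed and that you already have a numeric matrix of vst/rlog values with gene symbols as row names. The helper name `run_estimate`, its arguments, and the file names are hypothetical; `platform = "illumina"` is an assumption for RNA-seq data, so check the package documentation for the value that fits your data.

```r
# Sketch only: `run_estimate`, `prefix`, and `out_dir` are hypothetical names.
# `expr` is assumed to be a numeric matrix (genes x samples) of vst/rlog values
# with gene symbols as row names.
run_estimate <- function(expr, prefix = "estimate", out_dir = tempdir()) {
  stopifnot(is.matrix(expr), is.numeric(expr), !is.null(rownames(expr)))
  if (!requireNamespace("estimate", quietly = TRUE)) {
    stop("The 'estimate' package is not installed.")
  }

  input_txt  <- file.path(out_dir, paste0(prefix, "_input.txt"))
  genes_gct  <- file.path(out_dir, paste0(prefix, "_genes.gct"))
  scores_gct <- file.path(out_dir, paste0(prefix, "_scores.gct"))

  # Write the matrix as tab-delimited text without quotes
  write.table(expr, file = input_txt, sep = "\t", quote = FALSE, col.names = NA)

  # Step 1: keep the genes shared with the ESTIMATE gene list and write a GCT file
  estimate::filterCommonGenes(input.f = input_txt, output.f = genes_gct,
                              id = "GeneSymbol")

  # Step 2: compute stromal, immune, and ESTIMATE scores from that GCT file
  estimate::estimateScore(input.ds = genes_gct, output.ds = scores_gct,
                          platform = "illumina")

  scores_gct  # path to the scores file
}

# With DESeq2, where `dds` is a DESeqDataSet you have already built:
# vsd  <- DESeq2::vst(dds, blind = TRUE)
# expr <- SummarizedExperiment::assay(vsd)
# run_estimate(expr, prefix = "my_cohort")
```

Both ESTIMATE steps take file paths, which is why the matrix is written to disk first.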

Kevin

ADD COMMENT • link 4.4 years ago by Kevin Blighe 88k

0

Hi,

First of all thanks for your helpful reply.

I am trying the ESTIMATE tool on RNA-seq data as well.

I have normalized the data and transformed it via rlog, but even after checking the manual and help pages, the steps are unclear to me.

Should I first use the filterCommonGenes() function to get a "genes.gct" file? With the code below, will I obtain "OV_estimate_score.gct", and can I then get the raw estimates from these files with the estimateScore() command?

    out.file <- tempfile(pattern="estimate", fileext=".gct")
    outputGCT(in.file, out.file)

I am sorry, I got quite confused by all the information.

ADD REPLY • link 3.4 years ago by MS ▴ 40

0

Hi, I think that function (outputGCT()) just changes the format of the data. You have read through the vignette, right?

ADD REPLY • link 3.4 years ago by Kevin Blighe 88k

0

Yes, I read it and followed the code given below. By the way, I don't have any repeated gene symbols.

in.file <- read.table("Normalized and rlog transformed DE lncRNAs.txt", sep = "\t", header=T)
lncRNAgct <- tempfile(pattern="estimate", fileext=".gct")
outputGCT(in.file, lncRNAgct)
estimateScore(lncRNAgct, "estimate_score.gct", platform="Illumina")

But it returned:

[1] "1 gene set: StromalSignature  overlap= 0"
[1] "2 gene set: ImmuneSignature  overlap= 0"

Could it be because my genes are lncRNA-related, or am I doing something wrong?

ADD REPLY • link 3.4 years ago by MS ▴ 40

1

You'll be surprised to hear that I have not actually used this package.

It seems that your first argument, lncRNAgct, should actually be the input GCT filename, i.e., its absolute or relative file location.

input.ds, character string specifying name of input GCT file containing stromal, immune, and estimate scores for each sample

output.ds, character string specifying name of output file

platform, character string indicating platform type. Defaults to "affymetrix"

[source: https://rdrr.io/rforge/estimate/man/e50-estimateScore.html]

I think that you have your parameters in the wrong places. See the example at the bottom of the page linked above.

ADD REPLY • link 3.4 years ago by Kevin Blighe 88k

0

Hi, thank you for your reply.

I tried it, but it still reports 0 overlap.

When I checked the example file bundled with the package, I noticed that the expression data have no negative values, but my data do. Could that be the problem?

ADD REPLY • link 3.4 years ago by MS ▴ 40

0

The documentation says filterCommonGenes() takes either the path to your file or your data frame as input. I tried using the data frame as input, but it didn't work and I got this error:

(is.character(input.f) && length(input.f) == 1 && nzchar(input.f)) || .... is not TRUE)

which indicates that it only accepts a path, given as a character string, as input. I tried that and got this:

[1] "Merged dataset includes 0 genes (10412 mismatched)."
Error in dimnames(x) <- dn : length of 'dimnames' [2] not equal to array extent

Is there a specific format the input has to be in? You seemed to get it to work with a data frame.

ADD REPLY • link 3.4 years ago by John ▴ 10

1

I got the same error, (is.character(input.f) && length(input.f) == 1 && nzchar(input.f)) || .... is not TRUE), when I tried providing a data.frame. I changed the input to a string containing the complete file path to my saved data.frame, and it worked fine. Make sure your data are vst/rlog-transformed, with gene symbols as row names and samples as columns.

ADD REPLY • link 3.4 years ago by patelk26 ▴ 320

0

Are you starting off with raw counts or FPKMs (or another form) before log-transforming? I read in their paper that they used RPKM, so I'm going to try FPKM first since the two should be similar.

ADD REPLY • link 3.4 years ago by John ▴ 10

1

Also, what format is your data frame file in (.txt?) and what separator are you using?

ADD REPLY • link 3.4 years ago by John ▴ 10

0

Ah, I got it to work by looking at the sample data file. The data frame's .txt file has to be tab-separated, and there can be no quotes around the character values.
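For anyone hitting the same problem, here is a small sketch of writing an input file in that layout with base R. The toy gene symbols, values, and file name are made up for illustration. The read-back step assumes the file is parsed as plain tab-delimited text with quoting disabled, which would be consistent with the behaviour described above but is not taken from the package source.

```r
# Toy matrix: gene symbols as row names, samples as columns (values are made up)
expr <- matrix(c(5.1, 7.3,
                 6.8, 4.9,
                 8.2, 7.7),
               nrow = 3, byrow = TRUE,
               dimnames = list(c("CD3E", "COL1A1", "PTPRC"), c("s1", "s2")))

input_txt <- file.path(tempdir(), "estimate_input.txt")

# Tab-separated, no quotes; col.names = NA writes an empty header cell
# above the gene-symbol column so header and data rows have equal width
write.table(expr, file = input_txt, sep = "\t", quote = FALSE, col.names = NA)

# Read the file back as tab-delimited text with quoting disabled to confirm
# that gene symbols and sample names survive the round trip
check <- read.table(input_txt, header = TRUE, row.names = 1,
                    sep = "\t", quote = "")
print(check)
```

The path stored in `input_txt` (a character string, not the data frame itself) is what you would then pass to filterCommonGenes().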

ADD REPLY • link 3.4 years ago by John ▴ 10

1

Can you please provide the code you used? I am facing the same problem. Thank you.

ADD REPLY • link 2.6 years ago by lolo_e ▴ 10

0

I think the data must be in GCT format: https://software.broadinstitute.org/cancer/software/gsea/wiki/index.php/Data_formats. So the data that you are uploading as .txt may need to be organized according to that format, and the file should probably have a .gct extension instead of .txt.
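On the GCT side, here is a rough sketch of reading a scores file back into R and running the one-way ANOVA asked about in the original question. It assumes the scores file follows the standard GCT layout (a version line, a dimensions line, then a header with NAME and Description columns) and contains rows named StromalScore, ImmuneScore, and ESTIMATEScore. The toy file, its values, and the subtype labels are all invented, so substitute your own estimateScore() output and sample annotation.

```r
# Build a toy GCT file standing in for the estimateScore() output (made-up values)
gct_file <- file.path(tempdir(), "toy_estimate_score.gct")
samples  <- paste0("s", 1:6)
writeLines(c(
  "#1.2",
  "3\t6",
  paste(c("NAME", "Description", samples), collapse = "\t"),
  paste(c("StromalScore", "StromalScore", 120, 340, 560, 610, 900, 1150), collapse = "\t"),
  paste(c("ImmuneScore", "ImmuneScore", 200, 260, 800, 750, 1500, 1400), collapse = "\t"),
  paste(c("ESTIMATEScore", "ESTIMATEScore", 320, 600, 1360, 1360, 2400, 2550), collapse = "\t")
), gct_file)

# Skip the two GCT header lines; the third line holds the column names
gct <- read.table(gct_file, skip = 2, header = TRUE, sep = "\t",
                  row.names = 1, check.names = FALSE)

# Drop the Description column and transpose so that each row is a sample
scores <- as.data.frame(t(gct[, -1]))

# Hypothetical subtype labels, listed in the same order as the samples
scores$subtype <- factor(c("A", "A", "B", "B", "C", "C"))

# One-way ANOVA: does the immune score differ between subtypes?
fit <- aov(ImmuneScore ~ subtype, data = scores)
print(summary(fit))
```

The same call with `StromalScore` on the left-hand side, or with a cytogenetic risk factor in place of `subtype`, would cover the other comparisons.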

ADD REPLY • link 10 weeks ago by F • 0