GSVA Analysis
13 months ago
Ranjini ▴ 10

Hi All

I am attempting to perform GSVA analysis in R.

This is my code:

library(GSVA)

gene_expression <- as.matrix(read.csv("C:/Users/Documents/expression data.csv"))
gene_expression_matrix <- gene_expression[, -1]  # Exclude the first column with gene identifiers
rownames(gene_expression_matrix) <- gene_expression[, 1]

# Create a named list with a single gene set
my_gene_set <- list(
  Angio = c("VEGFA", "FGF2", "Angpt1", "HGF", "PDGFB", "EGF", "IGF", "ANG"))

# Function to check if a row has non-numeric values
has_non_numeric <- function(row) {
  any(!is.na(row) & !is.numeric(row))
}

# Filter the matrix to exclude rows with non-numeric values
filtered_gene_expression <- gene_expression_matrix[apply(gene_expression_matrix, 1, has_non_numeric), ]

scores <- gsva(filtered_gene_expression, my_gene_set, method = "gsva")

I keep getting this error:

Error in rowVars(x, rows = rows, cols = cols, na.rm = na.rm, center = center,  : 
  Argument 'x' must be of type logical, integer or numeric, not 'character'.

I have checked my expression data for non-numeric entries several times, and I do not know what is causing the error.

I would appreciate any help on this!

Best,
B

RNA-seq GSVA GSEA • 1.2k views
13 months ago

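Your filter keeps only the rows that contain non-numeric values, which in your case is effectively every row, because your matrix is of type character. is.numeric() tests the type of the whole vector rather than each element, so it returns FALSE for every row of a character matrix.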
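
To make this concrete, here is a minimal sketch on a made-up character matrix. The name toy and its values are illustrative assumptions, not the original data. It reuses the has_non_numeric helper from the question and shows that the filter returns TRUE for every row and leaves the matrix as character.

```r
# Toy character matrix standing in for the imported expression data (hypothetical values)
toy <- matrix(c("1.2", "3.4", "5.6", "7.8"), nrow = 2,
              dimnames = list(c("GENE1", "GENE2"), c("S1", "S2")))

# The helper from the question, unchanged
has_non_numeric <- function(row) {
  any(!is.na(row) & !is.numeric(row))
}

# is.numeric() looks at the type of the whole vector, not at each element
row_is_numeric <- is.numeric(toy["GENE1", ])

# Logical index used by the original filter: one value per row
keep <- apply(toy, 1, has_non_numeric)

# The "filtered" matrix still has character storage
filtered_type <- typeof(toy[keep, ])
```

Because keep is TRUE for both rows, nothing is removed, and the character matrix is passed on to gsva() unchanged.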

To fix the problem you need to first coerce your matrix to numeric. Any non-numeric values in the matrix will be converted to NA, so you can then just filter rows out that contain any NA values.

This line

# Filter the matrix to exclude rows with non-numeric values
filtered_gene_expression <- gene_expression_matrix[apply(gene_expression_matrix, 1, has_non_numeric), ]

should be:

# Coerce the matrix to numeric.
# Non-coercible characters (like letters) will be converted to NA.
class(gene_expression_matrix) <- "numeric"

# Keep only rows with no NA (non-numeric) values.
filtered_gene_expression <- gene_expression_matrix[complete.cases(gene_expression_matrix), ]
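
As a quick check of the fix, here is a small sketch on a made-up matrix. The name toy and the stray "n/a" entry are assumptions for illustration. It applies the same coercion and complete.cases() filter, and adds drop = FALSE (not in the answer above) so the result stays a matrix even if only one row survives.

```r
# Toy character matrix; "n/a" stands in for a stray non-numeric entry (hypothetical)
toy <- matrix(c("1.2", "n/a", "5.6", "7.8"), nrow = 2,
              dimnames = list(c("GENE1", "GENE2"), c("S1", "S2")))

# Coerce in place; R warns that NAs were introduced by coercion for "n/a"
class(toy) <- "numeric"

# Keep only rows without NA; drop = FALSE preserves the matrix shape
filtered <- toy[complete.cases(toy), , drop = FALSE]

rownames(filtered)
```

Dimensions and row names survive the coercion, so the gene identifiers stay attached to the rows that are kept.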

Thank you!

This solved the issue.


Is there a way to calculate the significance of the difference between the GSVA scores for the different samples above? I have seen some bootstrapping techniques suggested online but cannot find more details.


Section 6.2 of the GSVA docs discusses how to perform differential pathway analysis using limma.
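
A rough sketch of what a limma-based comparison of GSVA scores might look like is below. It is not taken from the GSVA documentation. The score matrix is simulated, and the gene set names, sample names, and the two-group layout (control vs. treated) are all assumptions made for illustration. In practice scores would be the gene-set-by-sample matrix returned by gsva().

```r
library(limma)

set.seed(1)

# Hypothetical GSVA score matrix: 20 gene sets x 6 samples (simulated values)
scores <- matrix(rnorm(120), nrow = 20,
                 dimnames = list(paste0("Set", 1:20), paste0("S", 1:6)))

# Hypothetical two-group layout: three control and three treated samples
group <- factor(rep(c("control", "treated"), each = 3))
design <- model.matrix(~ group)

# Fit one linear model per gene set, then moderate the statistics
fit <- eBayes(lmFit(scores, design))

# Table of treated-vs-control differences for every gene set
res <- topTable(fit, coef = "grouptreated", number = Inf)
res
```

The resulting table has one row per gene set, with a log fold change, a moderated t-statistic, and raw and adjusted p-values for the group difference.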
