How to extract rows with specific values in R for all the columns?
8 months ago
Lakshmi ▴ 20

[Figure: template of a CN ratio file in .csv format]

Hi,

The figure above is an example of a large copy number (CN) ratio file in .csv format that I am currently trying to analyze. I want to extract only the genes that have a CN ratio above 2 or below 0.2, keeping the entire row (all columns) for every gene that meets the condition. In the example data, that means the complete rows for gene 3, gene 4, gene 5, and gene 6. I want to select them with the >2 or <0.2 conditions rather than by row name, because I am analyzing 20,000 rows. Please let me know how to do this in R.

Thanks for your help in advance!

8 months ago

Here are two tidyverse approaches.

Load the required libraries and read in the data.

library("dplyr")
library("readr")

data <- read_csv("input.csv")

You can use rowwise() to test the condition across the columns of each row.

filtered_data <- data |>
  rowwise() |>
  filter(any(c_across(!Genes) > 2 | c_across(!Genes) < 0.2)) |>
  ungroup()

Or use the more traditional approach: pivot the data to long format, filter by gene, then pivot back to wide format.

library("tidyr")

filtered_data <- data |>
  pivot_longer(!Genes) |>
  group_by(Genes) |>
  filter(any(value > 2 | value < 0.2)) |>
  ungroup() |>
  pivot_wider(names_from=name, values_from=value)
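
As a possible third tidyverse variant, the same "any column meets the condition" filter can be sketched with `if_any()`, which is available in recent dplyr releases (1.0.4 and later). This is only an illustration: the toy data frame below is invented, and the column name `Genes` is assumed to match the first column of the real file, as in the code above.

```r
library("dplyr")

# Toy data, invented for illustration; the real input would come from read_csv().
data <- data.frame(
  Genes   = c("gene1", "gene2", "gene3", "gene4"),
  sample1 = c(1.0, 0.9, 2.5, 0.1),
  sample2 = c(1.1, 1.0, 1.0, 1.0)
)

# Keep a row if any column other than Genes is above 2 or below 0.2.
filtered_data <- data |>
  filter(if_any(!Genes, ~ .x > 2 | .x < 0.2))

print(filtered_data)
```

This avoids both the row-wise grouping and the round trip through long format, which may matter for speed on a file with many rows.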

Thanks, rpolicastro and @zx8754, for your responses. All three code snippets worked for me. I really appreciate your help!

8 months ago
zx8754 12k

Using base R, check whether each value meets the condition, then keep the rows where at least one value does:

data[ which(rowSums(data[, -1 ] > 2 | data[, -1 ] < 0.2) > 0), ]
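
To show what the one-liner does step by step, here is a small sketch on an invented data frame (the gene names, sample columns, and values are assumptions for illustration, not the asker's data). It also adds `na.rm = TRUE`, which is a variation on the code above: without it, a row containing an `NA` gets an `NA` row sum and is dropped by `which()` even if another column meets the condition.

```r
# Toy data, invented for illustration; the first column holds the gene names.
data <- data.frame(
  Genes   = paste0("gene", 1:6),
  sample1 = c(1.0, 0.9, 2.5, 0.1, 1.2, 1.0),
  sample2 = c(1.1, 1.0, 1.0, 1.0, 3.0, NA),
  sample3 = c(0.8, 1.5, 1.0, 1.0, 1.0, 0.15)
)

values <- data[, -1]                     # drop the gene-name column
hit    <- values > 2 | values < 0.2      # logical matrix, one cell per value
keep   <- rowSums(hit, na.rm = TRUE) > 0 # TRUE if any cell in the row is a hit
filtered <- data[keep, ]

print(filtered)
```

Whether rows with missing values should be kept this way depends on the analysis; dropping `na.rm = TRUE` restores the behaviour of the original one-liner.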