Problems with the same gene names in R, any help?
3.9 years ago

Hello everyone, I need to run a gene enrichment analysis with the clusterProfiler package in R. I have a gene list that contains only gene names. The data had some NAs, which I removed with the na.omit() function. Now I have a clean list, but I need to make sure that no gene name appears more than once. Later, I need to reunite the truncated data. I couldn't find the right functions and arguments for these steps. Sorry for the inconvenience, and thank you very much for the help, Mervenur

R gene • 1.0k views
Why would your data contain the same gene name more than once? Picking one copy at random may be a bad idea, because the duplicates could represent different things.

They're all miRNA target genes. All I know is that I need to make sure each gene name appears only once in my data.

Group by gene and collapse all the miRNAs for each gene into one comma-separated value. That way you would not lose any information. Then remove any remaining NAs.
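
A minimal sketch of that suggestion in base R, assuming a table with one row per gene/miRNA pair. The column names `gene` and `miRNA` and the values are hypothetical, since the real data was not posted:

```r
# Hypothetical miRNA target table: one row per (gene, miRNA) pair
targets <- data.frame(
  gene  = c("TP53", "TP53", "MYC", NA, "PTEN"),
  miRNA = c("miR-21", "miR-34a", "miR-34a", "miR-155", "miR-21"),
  stringsAsFactors = FALSE
)

# Drop rows that contain NA in any column
targets <- na.omit(targets)

# One row per gene; all miRNAs for that gene joined with commas
collapsed <- aggregate(
  miRNA ~ gene,
  data = targets,
  FUN  = function(x) paste(unique(x), collapse = ",")
)

print(collapsed)
```

Each gene then appears exactly once, and the miRNA column still records every miRNA that targets it.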

3.9 years ago
lessismore ★ 1.4k

Use the unique() function from base R or distinct() from the dplyr package.
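
As a rough illustration (the data frame and its `gene` and `score` columns are made up for this example), removing duplicates in base R could look like this:

```r
# Hypothetical gene list in which some symbols are repeated
gene_list <- data.frame(
  gene  = c("TP53", "MYC", "TP53", "PTEN", "MYC"),
  score = c(0.9, 0.8, 0.7, 0.6, 0.5),
  stringsAsFactors = FALSE
)

# unique() on a vector returns each symbol once, in order of first appearance
unique_symbols <- unique(gene_list$gene)

# duplicated() flags repeats, so this keeps the first row for each gene
# together with its other columns
dedup <- gene_list[!duplicated(gene_list$gene), ]

# dplyr equivalent, if the package is installed:
#   dplyr::distinct(gene_list, gene, .keep_all = TRUE)

print(unique_symbols)
print(dedup)
```

Note that dplyr's argument is spelled `.keep_all`, with a leading dot.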

Thank you so much, that was really helpful. I used the distinct() function from the dplyr package. After that, to combine the truncated data, I used the unite() function from the tidyr package, but it gave me an error. Do you know why? Or what should I do instead?

Hi, I have no idea what your data looks like. You should post what you start from and what you're trying to achieve. A small reproducible example will help people understand your question.
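
One common way to share such an example is to paste the output of dput() on the first few rows. A small sketch, using a made-up `genes` data frame in place of the real file:

```r
# Stand-in for the real data (hypothetical values)
genes <- data.frame(
  gene = c("TP53", NA, "MYC", "TP53"),
  stringsAsFactors = FALSE
)

# dput() writes R code that recreates the object; capture it as text
snippet <- capture.output(dput(head(genes, 3)))
cat(snippet, sep = "\n")

# Anyone can rebuild the same rows by evaluating the pasted text
restored <- eval(parse(text = snippet))
```

Posting that text together with the exact error message usually gives others enough to reproduce the problem.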

I am writing down the commands I used and what I am trying to achieve.

    genes <- read.table("dataname.txt", header = TRUE, sep = "\t")
    gene_list <- na.omit(genes)
    library(dplyr)
    # dplyr spells this argument .keep_all (with a leading dot)
    gene_names <- distinct(gene_list, .keep_all = TRUE)

Now I have the gene names and I want to unite the truncated data. After that, I want to run the enrichment with the enrichGO() function from clusterProfiler. I assume I need to find the gene IDs before calling that function, but I have no idea how to do that either. Sorry for the inconvenience, and thank you so much.
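
For the ID lookup and the enrichment step, one possible sketch uses bitr() and enrichGO() from clusterProfiler. It assumes human genes (the `org.Hs.eg.db` annotation package), which the thread does not state, and the gene symbols, ontology, and cutoffs are placeholders:

```r
# Hypothetical, de-duplicated gene symbols; replace with your own column
symbols <- unique(c("TP53", "MYC", "PTEN", "TP53"))

# Both packages come from Bioconductor. The organism database is an
# assumption (human); use the OrgDb package that matches your species.
if (requireNamespace("clusterProfiler", quietly = TRUE) &&
    requireNamespace("org.Hs.eg.db", quietly = TRUE)) {
  library(clusterProfiler)
  library(org.Hs.eg.db)

  # Map gene symbols to Entrez gene IDs
  ids <- bitr(symbols,
              fromType = "SYMBOL",
              toType   = "ENTREZID",
              OrgDb    = org.Hs.eg.db)

  # GO over-representation test on the mapped IDs
  ego <- enrichGO(gene          = ids$ENTREZID,
                  OrgDb         = org.Hs.eg.db,
                  keyType       = "ENTREZID",
                  ont           = "BP",
                  pAdjustMethod = "BH",
                  pvalueCutoff  = 0.05,
                  qvalueCutoff  = 0.2)

  print(head(ego))
}
```

enrichGO() also accepts `keyType = "SYMBOL"`, so the conversion step may not be strictly needed if the symbols match the annotation database.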
