How to split gene names in my list?
9.9 years ago
Parham ★ 1.6k

Hi,

I am working in R and have a long list of gene names; the first few entries are shown here:

[1] SPNCRNA.1436,omh5,snR95
[2] snR46
[3] snR10
[4] SPNCRNA.1651,SPNCRNA.515
[5] snR42
[6] SPNCRNA.1094,SPNCRNA.1095,SPRRNA.47,SPRRNA.48
[7] snR88
[8] SPNCRNA.497
[9] SPSNORNA.54
[10] snoR39b

Is there any way to split the entries that contain several gene names into individual ones? The list is very long.

Thanks!

R • 3.4k views
9.9 years ago
TriS ★ 4.7k

To elaborate a little on RamRS's answer:

x <- c("SPNCRNA.1436,omh5,snR95", "snR46", "snR10", "SPNCRNA.1651,SPNCRNA.515")
results <- c()
for (i in seq_along(x)) {
  # strsplit returns a list of length 1; [[1]] extracts the vector of names
  xi <- strsplit(x[i], ",")[[1]]
  # append every gene name found in this entry
  results <- c(results, xi)
}

results
[1] "SPNCRNA.1436" "omh5"         "snR95"        "snR46"        "snR10"        "SPNCRNA.1651" "SPNCRNA.515"

--added some edits--

RamRS already gave you perfect code, but I have still modified mine above.

With this you can then save the results in whichever way you prefer.

x is your initial vector of names. If it is a column of a matrix or data frame, then x will be myMatrix[, j] (where j is the index of that column) or myDataFrame$x.
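As a rough sketch of the data-frame case just described, assuming a hypothetical data frame `myDataFrame` with a column `x` that holds the comma-separated names (both names are placeholders, and the temporary output file is chosen only for illustration). The `as.character` call is there in case the column was read in as a factor, which `strsplit` does not accept.

```r
# Hypothetical data frame; the column x holds the comma-separated gene names.
myDataFrame <- data.frame(
  x = c("SPNCRNA.1436,omh5,snR95", "snR46", "SPNCRNA.1651,SPNCRNA.515"),
  stringsAsFactors = TRUE  # mimic a column that was read in as a factor
)

# strsplit() needs a character vector, so convert the factor first.
x <- as.character(myDataFrame$x)

# Split every entry on commas and flatten into one character vector.
genes <- unlist(strsplit(x, ",", fixed = TRUE))

# Save one gene name per line (a temporary file stands in for a real path).
outFile <- tempfile(fileext = ".txt")
writeLines(genes, outFile)
```

Reading the file back with `readLines(outFile)` should return the same vector of names.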


Thanks for taking the time to write the code. The thing is, the gene list contains about 500 entries and I need to apply it to the whole list. Besides, the gene names are not alternative names for one gene, so just splitting them would be enough. I would appreciate any help with that.


I've just added more details to my answer. HTH.


Added some edits too; it can handle any number of entries, whether 10 or 500.


Thank you both. Sorry for the late reply; I've been caught up with some stuff.

The code you showed me was a big lesson.

Cheers!

9.9 years ago
Ram 44k

You can use sapply with strsplit to split each element of the vector on commas (,).

Edit: Sorry, I was half asleep when I wrote this answer, so could not test code before posting it here. I've now added the logic behind my solution, code and output.

Logic:

Apply strsplit on each element of the vector and flatten the resulting list of vectors using unlist.

Code:

> x <- c("SPNCRNA.1436,omh5,snR95", "snR46", "snR10", "SPNCRNA.1651,SPNCRNA.515", "SPNCRNA.1094,SPNCRNA.1095,SPRRNA.47,SPRRNA.48")
> listOfSplitStringVectors <- sapply(x, function(i) strsplit(i, ","))
> flattenedVectorOfNames <- unlist(listOfSplitStringVectors, recursive = TRUE, use.names = FALSE)
> flattenedVectorOfNames

Output:

[1] "SPNCRNA.1436" "omh5"         "snR95"        "snR46"        "snR10"       
[6] "SPNCRNA.1651" "SPNCRNA.515"  "SPNCRNA.1094" "SPNCRNA.1095" "SPRRNA.47"   
[11] "SPRRNA.48"
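
Since `strsplit` already works on a whole character vector, the `sapply` wrapper is arguably optional. A minimal sketch of the same logic in one call, using a shortened version of the vector above. `fixed = TRUE` treats the comma as a literal string rather than a regular expression; the per-entry counts and the `unique` call are optional extras, included only on the assumption that they might be useful downstream.

```r
x <- c("SPNCRNA.1436,omh5,snR95", "snR46", "snR10", "SPNCRNA.1651,SPNCRNA.515")

# strsplit() is vectorised: it returns a list with one character vector per entry.
pieces <- strsplit(x, ",", fixed = TRUE)

# Flatten the list into a single character vector of gene names.
genes <- unlist(pieces, use.names = FALSE)

# Optional: number of names per original entry, and de-duplicated names.
namesPerEntry <- lengths(pieces)
uniqueGenes <- unique(genes)
```

This works the same way for 10 or 500 entries, because no explicit loop over the entries is needed.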