R Script For Extracting Taxonomy Levels
10.8 years ago
sebabiokr ▴ 10

Dear B, I got an error with a heat map in R. I want an R function to extract taxonomy levels from an OTU table, as follows:

    extract.name.level <- function(x, level) {
        a <- c(unlist(strsplit(x, ';')), 'Other')
        paste(a[1:min(level, length(a))], collapse = ';')
    }

OTU table

OTU ID    Control    P1    S1    taxonomy
3506234    1298    466    1074    Bacteria; Bacteroidetes; Bacteroidia; Bacteroidales; Porphyromonadaceae; Proteiniphilum
New.ReferenceOTU31    731    0    2    Bacteria; Bacteroidetes; Sphingobacteria; Sphingobacteriales
4444585    700    374    520    Bacteria
New.CleanUp.ReferenceOTU1531    606    84    26    Bacteria; Proteobacteria; Betaproteobacteria; Burkholderiales; Alcaligenaceae; Tetrathiobacter
670013    441    17    172    Bacteria; Synergistetes; Synergistia; Synergistales; Synergistaceae
661526    405    90    49    Bacteria; Synergistetes; Synergistia; Synergistales; Synergistaceae; Cloacibacillus
4367900    374    103    556    Bacteria; Tenericutes; Mollicutes; Acholeplasmatales; Acholeplasmataceae; Acholeplasma
817135    358    30    206    Bacteria; Proteobacteria; Deltaproteobacteria; Desulfovibrionales; Desulfovibrionaceae; Desulfovibrio

Thank you

Seba


Judging from your function, are you trying to extract rows of that table having a given string in the taxonomy field or something else? Perhaps you also just want to know what levels there are, which would be something like:

> unique(unlist(strsplit(d$taxonomy, "; ")))
 [1] "Bacteria"            "Bacteroidetes"       "Bacteroidia"        
 [4] "Bacteroidales"       "Porphyromonadaceae"  "Proteiniphilum"     
 [7] "Sphingobacteria"     "Sphingobacteriales"  "Proteobacteria"     
[10] "Betaproteobacteria"  "Burkholderiales"     "Alcaligenaceae"     
[13] "Tetrathiobacter"     "Synergistetes"       "Synergistia"        
[16] "Synergistales"       "Synergistaceae"      "Cloacibacillus"     
[19] "Tenericutes"         "Mollicutes"          "Acholeplasmatales"  
[22] "Acholeplasmataceae"  "Acholeplasma"        "Deltaproteobacteria"
[25] "Desulfovibrionales"  "Desulfovibrionaceae" "Desulfovibrio"
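If the goal is instead the first interpretation, keeping only the rows whose taxonomy contains a given taxon, a minimal sketch could look like the following. It assumes the table is in a data frame `d` with a character `taxonomy` column, as in the call above; the `OTU` column name, the three-row subset, and the helper `has_taxon` are made up for illustration.

```r
# Three rows copied by hand from the OTU table in the question
d <- data.frame(
  OTU = c("3506234", "670013", "661526"),
  taxonomy = c(
    "Bacteria; Bacteroidetes; Bacteroidia; Bacteroidales; Porphyromonadaceae; Proteiniphilum",
    "Bacteria; Synergistetes; Synergistia; Synergistales; Synergistaceae",
    "Bacteria; Synergistetes; Synergistia; Synergistales; Synergistaceae; Cloacibacillus"
  ),
  stringsAsFactors = FALSE
)

# Hypothetical helper: TRUE for each taxonomy string that has `taxon`
# as one of its ranks (exact match on a whole rank, not a substring)
has_taxon <- function(taxonomy, taxon) {
  vapply(strsplit(taxonomy, ";", fixed = TRUE),
         function(parts) taxon %in% trimws(parts),
         logical(1))
}

# Keep only the rows assigned to Synergistetes
d[has_taxon(d$taxonomy, "Synergistetes"), ]
```

Matching on whole ranks rather than using `grepl` on the full string avoids accidental hits where one taxon name is a substring of another.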

What's your problem precisely? The function seems to work and extract the first n levels separated by ';' if x is a character vector of length 1:

x = "Bacteria; Bacteroidetes; Bacteroidia; Bacteroidales; Porphyromonadaceae; Proteiniphilum"
extract.name.level(x,1)
[1] "Bacteria"
extract.name.level(x,2)
[1] "Bacteria; Bacteroidetes"
 ...
extract.name.level(x,7)
[1] "Bacteria; Bacteroidetes; Bacteroidia; Bacteroidales; Porphyromonadaceae; Proteiniphilum;Other"

I'm not sure what the "Other" is for; appending it doesn't make much sense to me.
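Since the function only handles a single string, applying it to a whole taxonomy column needs a loop or a vectorised variant. The following is a rough sketch of such a variant, not the original function: the name `extract_levels` is made up, it trims the spaces around each rank, and it deliberately leaves out the "Other" padding.

```r
# Hypothetical vectorised variant: first `level` ranks of each taxonomy string.
# Assumes `level` is at least 1 and ranks are separated by ';'.
extract_levels <- function(x, level, sep = "; ") {
  vapply(strsplit(x, ";", fixed = TRUE), function(parts) {
    parts <- trimws(parts)  # drop the spaces left around each rank
    paste(parts[seq_len(min(level, length(parts)))], collapse = sep)
  }, character(1))
}

taxonomy <- c(
  "Bacteria; Bacteroidetes; Bacteroidia; Bacteroidales; Porphyromonadaceae; Proteiniphilum",
  "Bacteria; Bacteroidetes; Sphingobacteria; Sphingobacteriales",
  "Bacteria"
)

extract_levels(taxonomy, 2)
# [1] "Bacteria; Bacteroidetes" "Bacteria; Bacteroidetes" "Bacteria"
```

Strings with fewer ranks than requested, such as the plain "Bacteria" entry, are returned unchanged rather than padded.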


How is that connected to heatmaps?


If I understand correctly, you want to use the taxonomy column as row names for your heatmap and the three columns in the middle as the heatmap itself. Is that correct?

Do you need only the last taxonomy ID to be extracted?
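If that reading is right, one possible approach is sketched below. It assumes the counts sit in columns named `Control`, `P1` and `S1` next to a character `taxonomy` column, uses a three-row subset of the table typed in by hand, and draws with base R's `heatmap()`, which is not necessarily what the tutorial uses.

```r
# Three rows copied by hand from the OTU table in the question
otu <- data.frame(
  Control = c(1298, 731, 700),
  P1 = c(466, 0, 374),
  S1 = c(1074, 2, 520),
  taxonomy = c(
    "Bacteria; Bacteroidetes; Bacteroidia; Bacteroidales; Porphyromonadaceae; Proteiniphilum",
    "Bacteria; Bacteroidetes; Sphingobacteria; Sphingobacteriales",
    "Bacteria"
  ),
  stringsAsFactors = FALSE
)

# Deepest assigned rank of each taxonomy string
last_level <- vapply(strsplit(otu$taxonomy, ";", fixed = TRUE),
                     function(parts) trimws(parts[length(parts)]),
                     character(1))

# Numeric matrix of counts, labelled by taxon; make.unique() adds a
# suffix when two OTUs share the same deepest rank
counts <- as.matrix(otu[, c("Control", "P1", "S1")])
rownames(counts) <- make.unique(last_level)

# Log-transformed counts; wide right margin so the taxon labels fit
heatmap(log10(counts + 1), Colv = NA, scale = "none", margins = c(5, 12))
```

`heatmap()` takes its row labels from the row names of the matrix, so a heatmap that comes out without taxa labels usually means the matrix passed in had no row names set.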


Thank you for the quick response. Yes, I followed the tutorial to make a heatmap but got an error with the taxa IDs in the heatmap: http://learningomics.wordpress.com/

It produced a heatmap without taxa IDs.
