Hello all!
I am completely new to R and I'm trying to run the getId function from the taxonomizr package.
It works like this:
data<-getId(c('Pestivirus A','Bos taurus','Homo'),taxaNames)
and as output you get a table with one row containing the IDs associated with each species.
I have a lot of inputs (around 7000) that I concatenated in a .txt file like this:
'Pestivirus A','Bos taurus','Homo'...
I tried to copy and paste the whole content of the .txt file into the argument of the getId function. But when I run the command, nothing happens and the console shows a + prompt instead of >. Copy-pasting only works for about a hundred inputs at most (out of 7000).
Is there a way to use the .txt file directly to avoid doing 70 copy-pastes?
I should add that this is the first time I am using R. So far I have only imported my data like this:
species <- read.csv("C:/Users/pdoinel/Desktop/species.txt", header=FALSE)
OK, the result I get is: Error in out[taxa] : type 'list' d'indice incorrect (in English: invalid subscript type 'list')
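That error is consistent with passing a data frame to getId: read.csv returns a data frame, which is a list, while the getId call in the question takes a character vector of names. A minimal sketch of one way around it, assuming the file really is a single line of single-quoted, comma-separated names as described, is to read it with scan() instead. The temporary file below only stands in for species.txt, and the final getId call is left commented out because it needs taxonomizr and the taxaNames object from the question.

```r
# Recreate a small file in the same one-line, single-quoted format
# (temporary path used for illustration only)
path <- tempfile(fileext = ".txt")
writeLines("'Pestivirus A','Bos taurus','Homo'", path)

# scan() returns a plain character vector rather than a data frame;
# quote = "'" removes the single quotes around each name
species <- scan(path, what = character(), sep = ",", quote = "'",
                strip.white = TRUE, quiet = TRUE)
print(species)

# With taxonomizr loaded and taxaNames prepared as in the question:
# data <- getId(species, taxaNames)
```

If the data has already been imported with read.csv, as.character(unlist(species)) should also flatten the one-row data frame into a character vector.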
have used
OK, I just did that, but I get no output.
My .txt file is:
print(df) gives:
print(species) gives:
You should show a few lines of the input .txt file at the very beginning. How many lines does it have?
If there is more than one line, use a loop.
It is the only line. I tried the command with this small subset, nothing more. Could the problem come from the  ?
You may change the encoding of the text file to UTF-8.
The problem does not only come from the encoding. I did df[1] <- NULL to delete this "bad variable", and it still doesn't work. However, when I copy and paste this subset, it works. EDIT: actually it works for the first species of my .txt file (after deleting the encoding).
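If the stray characters at the start of the file are a UTF-8 byte order mark (an assumption, the thread does not confirm it), R can strip it while reading instead of editing the file by hand. The sketch below writes a small temporary file with a BOM in front of the names, then reads it with fileEncoding = "UTF-8-BOM", which R's connections accept for reading. The file content is made up for illustration.

```r
# Build a temporary file that starts with a UTF-8 byte order mark (EF BB BF)
path <- tempfile(fileext = ".txt")
con <- file(path, "wb")
writeBin(as.raw(c(0xEF, 0xBB, 0xBF)), con)
writeBin(charToRaw("'Pestivirus A','Bos taurus','Homo'\n"), con)
close(con)

# fileEncoding = "UTF-8-BOM" drops the BOM, so the first name is read cleanly
species <- scan(path, what = character(), sep = ",", quote = "'",
                strip.white = TRUE, quiet = TRUE,
                fileEncoding = "UTF-8-BOM")
print(species)
```

The same fileEncoding argument can be passed to read.csv.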
It seems the one-line format is not the original format; one name per line would be the easiest and most convenient format for downstream processing.
Also, 7000 is not a small number. You can try taxonkit name2taxid to map scientific names to TaxIDs or to further retrieve lineages. It supports Windows, but you need to run it from a command-line console.
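As a rough illustration of the one-name-per-line layout, the sketch below writes a few names to a temporary file, one per line, and reads them back with readLines(), which gives a character vector directly. The names and the temporary path are placeholders for the real 7000-entry file.

```r
# Example names; in practice these come from the real species list
species <- c("Pestivirus A", "Bos taurus", "Homo")

# Write one name per line (temporary path used for illustration only)
path <- tempfile(fileext = ".txt")
writeLines(species, path)

# Read the names back: one element per line, no quotes or separators to parse
species_from_file <- trimws(readLines(path))

# Drop empty lines, e.g. a trailing blank line at the end of the file
species_from_file <- species_from_file[nzchar(species_from_file)]
print(species_from_file)
```

A file in this layout can also be used as input for line-oriented command-line tools without further conversion.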