I have set up Ensembl as my database, but I cannot get this code to work. Any help?
results <- getBM(attributes = c("ensembl_gene_id", "hgnc_symbol"), filters = "hgnc_symbol", values = genes$hgnc, mart = mart)
Error in getBM(attributes = c("ensembl_gene_id", "hgnc_symbol"), filters = "hgnc_symbol", :
Values argument contains no data.
What is the output of head(genes$hgnc)?
This returns NULL. What do I have to do to give it a value?
Your vector genes$hgnc seems to contain no rows. Try length(genes$hgnc); class(genes$hgnc)
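A quick way to see why genes$hgnc comes back as NULL is to look at which columns the data frame actually has. This is only a minimal sketch: the two-row data frame and its column name "symbol" are made up for illustration and are not the poster's real data.

```r
# Hypothetical stand-in for the data frame read from the file.
# Its only column is NOT called "hgnc".
genes <- data.frame(symbol = c("TP53", "SRY"), stringsAsFactors = FALSE)

length(genes$hgnc)  # 0, because no column named "hgnc" exists
class(genes$hgnc)   # "NULL"
names(genes)        # the column names R actually assigned
str(genes)          # overview of the whole data frame
```

If names(genes) does not list "hgnc", then genes$hgnc is NULL and getBM() has nothing to search for.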
Oh yes, I have created a list of genes in a CSV file, put it in the right directory, and changed "my file" to the name of my CSV file.
Changing the values to TP53 and SRY worked, but my list is 200 genes long. I'm hoping to get the requested info (specified in the code) for the whole list. Is it possible, with the right code, to search Ensembl for this info for all of the genes in my list?
Please comment under answers rather than posting new questions as answers.
If what you say regarding the CSV file is true, then the value of genes$hgnc cannot be NULL. Check again.
BioMart should have no issues with 200 genes; I routinely use query vectors with thousands of entries.
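To illustrate passing a whole vector of symbols, here is a rough sketch that checks the values before calling getBM(), so an empty or NULL vector is caught early with a clearer message. The helper names clean_symbols() and lookup_symbols() are hypothetical, and mart is assumed to be a Mart object you have already created (for example with biomaRt::useMart()).

```r
# Hypothetical helper: tidy a vector of gene symbols before querying.
clean_symbols <- function(x) {
  x <- trimws(as.character(x))        # drop surrounding whitespace
  unique(x[!is.na(x) & nzchar(x)])    # drop NA, empty strings, duplicates
}

# Hypothetical wrapper around biomaRt::getBM().
# `mart` is assumed to be an existing Mart object.
lookup_symbols <- function(symbols, mart) {
  symbols <- clean_symbols(symbols)
  if (length(symbols) == 0) {
    stop("No gene symbols to query; check the column name with names(genes).")
  }
  biomaRt::getBM(
    attributes = c("ensembl_gene_id", "hgnc_symbol"),
    filters    = "hgnc_symbol",
    values     = symbols,
    mart       = mart
  )
}
```

With a valid column, something like results <- lookup_symbols(genes$hgnc, mart) would send all the symbols in one query.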
Is this where you want me to post? Yes, the CSV file is real, and genes$hgnc just returns NULL.
The full code is as follows:
And results returns:
What do the first few lines of chromatin.csv look like (including the header row)?
OK. First, that is not a CSV file, although read.csv() will still read it in just fine.
Second, the first row does not contain the value "hgnc". The header row is where the name in genes$hgnc comes from. read.csv() assumes by default that the first row contains the column header, so what you have is genes$TDRKH.
There are lots of ways to fix this. One of them is to use read.csv() with the header = FALSE argument. Your column header will then be V1, so you would use genes$V1. But you should read the help pages or some online tutorials for read.csv() and read.table(), so you understand how to read data into R.
So does that mean that everywhere in the code that I have hgnc, e.g. hgnc_symbol, I should replace the hgnc with TDRKH?
No. That means that the first line of your file should be "hgnc", not "TDRKH". As I said, read.csv() assumes that the first line contains a header by default. If your first line is "TDRKH" and you don't specify header = FALSE, then R assumes that the column header is "TDRKH".
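The header behaviour can be reproduced without touching the real file. This is a small sketch: the text in txt is made up (only TDRKH comes from the thread), and col.names is one further option that read.csv() offers in addition to the fixes mentioned above.

```r
# Made-up file contents: one gene symbol per line, no header row.
txt <- "TDRKH\nTP53\nSRY\n"

# Default (header = TRUE): the first symbol is taken as the column name.
genes_default <- read.csv(text = txt, stringsAsFactors = FALSE)
names(genes_default)         # "TDRKH"
is.null(genes_default$hgnc)  # TRUE, so getBM() would receive no values

# Fix 1: declare that there is no header; the column is then called V1.
genes_v1 <- read.csv(text = txt, header = FALSE, stringsAsFactors = FALSE)
genes_v1$V1

# Fix 2: no header in the file, but name the column while reading.
genes_named <- read.csv(text = txt, header = FALSE, col.names = "hgnc",
                        stringsAsFactors = FALSE)
genes_named$hgnc
```

Adding a first line "hgnc" to the file itself has the same effect as the second fix, and either way genes$hgnc then holds every symbol, including the first one.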