Hi. I have a data frame of Ensembl gene IDs and their corresponding canonical transcript IDs. Below is a small subset of this data frame:
geneID transcriptID
ENSG00000186092 ENST00000335137
ENSG00000187634 ENST00000420190
ENSG00000187642 ENST00000341290
ENSG00000188157 ENST00000379370
ENSG00000186891 ENST00000379265
I would like to retrieve the protein coding sequences for these genes using the R package biomaRt. Using the code below I can do this:
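library(biomaRt)
# Connect to the human gene dataset in Ensembl BioMart
mart <- useMart("ensembl", dataset = "hsapiens_gene_ensembl")
# Fetch the coding sequences for a single gene ID
seq <- getSequence(id = "ENSG00000186092",
                   type = "ensembl_gene_id",
                   seqType = "coding",
                   mart = mart)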
My issue is that I would like to add a second filter on transcript ID, so that I only get the coding sequence for the specific canonical transcript of each gene. However, I am not sure how to do this with the getSequence function. I have tried adding a values argument (as with getBM), but this gives an "unused argument" error. Does anyone know how to do this?
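A minimal sketch of one possible approach, not confirmed anywhere in this thread: because each canonical transcript ID already identifies a single transcript, getSequence can be queried on the transcript IDs directly (type = "ensembl_transcript_id") and the result merged back onto the gene/transcript table, so no second filter is needed. The helper name canonical_cds is hypothetical, and the data frame simply repeats the subset shown above.

```r
# The subset of gene / canonical transcript pairs from the question
ids <- data.frame(
  geneID = c("ENSG00000186092", "ENSG00000187634", "ENSG00000187642",
             "ENSG00000188157", "ENSG00000186891"),
  transcriptID = c("ENST00000335137", "ENST00000420190", "ENST00000341290",
                   "ENST00000379370", "ENST00000379265"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: fetch the coding sequence of each listed transcript
# and attach it to the original table.
canonical_cds <- function(ids, mart) {
  seqs <- biomaRt::getSequence(id = ids$transcriptID,
                               type = "ensembl_transcript_id",
                               seqType = "coding",
                               mart = mart)
  # getSequence returns one column named after seqType ("coding") and one
  # named after type ("ensembl_transcript_id"); join on the transcript ID.
  merge(ids, seqs,
        by.x = "transcriptID", by.y = "ensembl_transcript_id",
        all.x = TRUE)
}

# Usage (requires the biomaRt package and network access to Ensembl):
# mart <- biomaRt::useMart("ensembl", dataset = "hsapiens_gene_ensembl")
# cds <- canonical_cds(ids, mart)
```

Rows for which Ensembl returns no sequence are kept with NA in the coding column because of all.x = TRUE.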
Hi Emily, thanks, that does help. However, is it actually possible to use two different filters?
You can filter by multiple types of ID, but you can filter by a list of IDs, plus a biotype or something.
Hi Emily, did you mean "You can <<not>> filter by multiple types of ID, but ..." ?
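To illustrate the combination mentioned above (a list of IDs plus a biotype), here is a hedged sketch using getBM, which accepts several filters when values is given as a list with one element per filter. The filter name "transcript_biotype" and the value "protein_coding" are assumptions that should be checked with listFilters(mart) for the dataset in use, and the function name cds_with_biotype is hypothetical.

```r
# Hypothetical helper: coding sequences for a list of transcript IDs,
# restricted to transcripts of a given biotype.
cds_with_biotype <- function(transcript_ids, mart, biotype = "protein_coding") {
  biomaRt::getBM(
    attributes = c("ensembl_transcript_id", "coding"),
    # Two filters: one ID list plus one non-ID filter (assumed filter name)
    filters = c("ensembl_transcript_id", "transcript_biotype"),
    # One list element per filter, in the same order as 'filters'
    values = list(transcript_ids, biotype),
    mart = mart
  )
}

# Usage (requires the biomaRt package and network access to Ensembl):
# mart <- biomaRt::useMart("ensembl", dataset = "hsapiens_gene_ensembl")
# res <- cds_with_biotype(c("ENST00000335137", "ENST00000420190"), mart)
```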