How can I get Ensembl CDS / 3'UTR and 5'UTR sequences
5.1 years ago
2405592M ▴ 150

Hi all,

I've finished an RNA-seq analysis and run differential expression on my data set. I now want to compare codon usage between conditions for my up- and down-regulated genes. How can I get the 5'UTR, CDS and 3'UTR sequences for a list of Ensembl IDs?

Thanks in advance!

RNA-Seq codon usage ensembl sequence • 3.1k views
5.1 years ago
Emily 24k

BioMart! Filter by your list of Ensembl IDs and get the sequences as Attributes. There's a help video to get you started if you've never used BioMart before.

Hi Emily,

I'm trying to do this through R using the following:

library(biomaRt)
listMarts()

ensembl <- useMart("ensembl")

datasets <- listDatasets(ensembl)
head(datasets)

ensembl <- useDataset("mmusculus_gene_ensembl", mart = ensembl)

attributeNames_codon_usage <- c("ensembl_gene_id", "5utr", "coding", "3utr")

ourFilterType <- "ensembl_gene_id"

filterValues <- rownames(Day4_CONvsCRE)

codon_usage_mm10_annot_Day4_CONvsCRE <- getBM(attributes = attributeNames_codon_usage,
                                              filters = ourFilterType,
                                              values = filterValues,
                                              mart = ensembl)

but I keep getting the following error:

Error in scan(file = file, what = what, sep = sep, quote = quote, dec = dec, : line 1 did not have 4 elements

You can't select multiple sequence types in one query, which is why the call fails. You need to run three separate queries, one for each sequence type.

I would also recommend including the transcript ID in your attributes, as you'll get one sequence per transcript rather than one per gene.
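
As a rough sketch of that advice in R, the query can be run once per sequence type and the results joined on the gene and transcript IDs. The helper name fetch_sequences() and its query_fun argument are made up for illustration and are not part of biomaRt; the attribute names ("5utr", "coding", "3utr", "ensembl_transcript_id") are assumed to be available in the mouse dataset used above.

```r
# Sketch only: run one getBM() query per sequence type, then merge the results.
# fetch_sequences() and query_fun are hypothetical names, not biomaRt API.
fetch_sequences <- function(gene_ids, mart,
                            seq_types = c("5utr", "coding", "3utr"),
                            query_fun = biomaRt::getBM) {
  id_cols <- c("ensembl_gene_id", "ensembl_transcript_id")

  # One query per sequence type, each with the gene and transcript IDs.
  per_type <- lapply(seq_types, function(seq_type) {
    query_fun(
      attributes = c(id_cols, seq_type),
      filters    = "ensembl_gene_id",
      values     = gene_ids,
      mart       = mart
    )
  })

  # Join the per-type tables so each transcript ends up on a single row.
  Reduce(function(x, y) merge(x, y, by = id_cols, all = TRUE), per_type)
}

# Usage (needs network access to Ensembl):
# library(biomaRt)
# ensembl <- useMart("ensembl", dataset = "mmusculus_gene_ensembl")
# seqs <- fetch_sequences(rownames(Day4_CONvsCRE), ensembl)
```

Transcripts without an annotated UTR may come back without a usable sequence, so it is worth checking the UTR columns before computing codon usage.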
