Hi. I want to get the exon annotation for a list of chromosome intervals (hg19). Here is the code:
getBM_value <- list(
  chromosome_name = bed_df$chromosome_name,
  start = bed_df$start,
  end = bed_df$end
)
mart <- useDataset('hsapiens_gene_ensembl', useMart('ensembl', host = "grch37.ensembl.org"))
fupanel_bed_anno <- getBM(attributes = c('chromosome_name', 'exon_chrom_start', 'exon_chrom_end',
                                         "strand", "ensembl_gene_id", "ensembl_exon_id"),
                          filters = c('chromosome_name', 'start', 'end'),
                          values = getBM_value,
                          mart = mart)
But it returns an error:
Error in curl::curl_fetch_memory(url, handle = handle) :
Timeout was reached: [grch37.ensembl.org:80] Operation timed out after 300001 milliseconds with 163661 bytes received
I've also tried the mirror argument:
mart <- useEnsembl(biomart='ensembl', dataset='hsapiens_gene_ensembl', mirror = "uswest", GRCh = 37)
Warning message:
In useEnsembl(biomart = "ensembl", dataset = "hsapiens_gene_ensembl", :
version or GRCh arguments can not be used together with the mirror argument.',
'We will ignore the mirror argument and connect to main Ensembl site.
fupanel_bed_anno <- getBM(attributes = c('chromosome_name', 'exon_chrom_start', 'exon_chrom_end',
                                         "strand", "ensembl_gene_id", "ensembl_exon_id"),
                          filters = c('chromosome_name', 'start', 'end'),
                          values = getBM_value,
                          mart = mart)
Error in curl::curl_fetch_memory(url, handle = handle) :
Timeout was reached: [grch37.ensembl.org:443] Operation timed out after 300001 milliseconds with 323405 bytes received
Tagging: Emily_Ensembl
Tagging: Mike Smith
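Trying a mirror is a good idea, but look at the warning: because you also set GRCh = 37, the mirror argument was ignored. Your second query therefore still went to grch37.ensembl.org (the host named in the timeout message), not to the uswest mirror.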
How big is your bed file?
232 KB, with 7,983 rows in total.
And how big are the regions in the bed file?
The bed file contains the exons of 500 genes. Each row is about 100 bp on average.
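With close to 8,000 short intervals, one thing that may be worth trying is to send the regions in smaller batches, so that no single request has to finish within the 300-second limit. Below is a minimal sketch, not a tested fix. It assumes bed_df has the chromosome_name, start and end columns used above. The helper names (make_regions, chunk_regions, get_exons_chunked) and the batch size of 200 are hypothetical choices of mine, not part of biomaRt. It uses the chromosomal_region filter, which takes one "chr:start:end" string per interval, instead of three separate chromosome_name / start / end filters. Whether this avoids the timeout will depend on how busy the GRCh37 server is.

```r
# Build one "chr:start:end" string per interval, the format accepted by the
# Ensembl BioMart 'chromosomal_region' filter.
# Assumption: coordinates are passed through as-is; converting 0-based BED
# starts to 1-based Ensembl coordinates is not done here.
make_regions <- function(chromosome_name, start, end) {
  # Ensembl names chromosomes "1", "2", ..., "X", so drop any "chr" prefix.
  chrom <- sub("^chr", "", as.character(chromosome_name))
  # scientific = FALSE avoids strings such as "1e+05" for round coordinates.
  paste(chrom,
        format(start, scientific = FALSE, trim = TRUE),
        format(end, scientific = FALSE, trim = TRUE),
        sep = ":")
}

# Split a vector of region strings into batches of at most chunk_size.
chunk_regions <- function(regions, chunk_size = 200) {
  split(regions, ceiling(seq_along(regions) / chunk_size))
}

# Query each batch separately and combine the results.
# chunk_size = 200 is an arbitrary starting point (assumption); tune as needed.
get_exons_chunked <- function(regions, mart, chunk_size = 200) {
  attrs <- c("chromosome_name", "exon_chrom_start", "exon_chrom_end",
             "strand", "ensembl_gene_id", "ensembl_exon_id")
  results <- lapply(chunk_regions(regions, chunk_size), function(chunk) {
    biomaRt::getBM(attributes = attrs,
                   filters = "chromosomal_region",
                   values = chunk,
                   mart = mart)
  })
  # The same exon can be returned by more than one batch, so de-duplicate.
  unique(do.call(rbind, results))
}

# Usage (needs network access to Ensembl, so left commented out):
# mart <- biomaRt::useEnsembl(biomart = "ensembl",
#                             dataset = "hsapiens_gene_ensembl", GRCh = 37)
# regions <- make_regions(bed_df$chromosome_name, bed_df$start, bed_df$end)
# fupanel_bed_anno <- get_exons_chunked(regions, mart)
```

Note that the usage lines call useEnsembl() with GRCh = 37 and no mirror argument, since the warning above says the two cannot be combined. Smaller batches mean more round trips, so it can help to wrap the getBM() call in tryCatch() and retry a failed batch rather than restart the whole run.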