I want to download the mouse genome (mm9) with some basic annotations: intergenic regions, promoter regions, TSS, gene start, gene end, CpG islands and so on.
Basically, I have split the genome (mm9) into 200bp bins. Each bin has some characteristics associated with it (mainly histone modifications). I'd like to traverse the genome bin by bin and look up the annotation for each bin, mainly to relate histone modifications to particular parts of the genome (see Ernst & Kellis).
I think I can use biomart (which has an awesome package in Bioconductor), but I can't seem to get the ENTIRE genome. Biomart can give me promoter regions, TSS, gene start/end and so on. Can I assume that anything outside these regions is intergenic?
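To make the binning idea concrete, here is a minimal Python sketch of labelling 200bp bins from a list of gene intervals. It is not biomart or Bioconductor code: the `Gene` tuple, the `annotate_bins` function, the 1000bp default promoter window and the label priority (TSS over promoter over gene body) are all assumptions made for illustration, and the coordinates in the usage note are toy values rather than real mm9 positions.

```python
from typing import List, NamedTuple

BIN_SIZE = 200  # bin width in bp, as described in the question


class Gene(NamedTuple):
    """Hypothetical gene record; coordinates are 0-based, end-exclusive."""
    start: int
    end: int
    strand: str  # "+" or "-"


def overlaps(a_start: int, a_end: int, b_start: int, b_end: int) -> bool:
    """True if the half-open intervals [a_start, a_end) and [b_start, b_end) overlap."""
    return a_start < b_end and b_start < a_end


def annotate_bins(chrom_length: int, genes: List[Gene],
                  promoter_up: int = 1000) -> List[str]:
    """Label each bin of one chromosome.

    Assumed priority: TSS > promoter > gene_body > intergenic.
    The promoter is assumed to be the promoter_up bp upstream of the TSS.
    Any bin touched by no gene or promoter is labelled intergenic.
    """
    labels = []
    for bin_start in range(0, chrom_length, BIN_SIZE):
        bin_end = min(bin_start + BIN_SIZE, chrom_length)
        label, rank = "intergenic", 0
        for g in genes:
            if g.strand == "+":
                tss = g.start
                prom_start, prom_end = max(0, g.start - promoter_up), g.start
            else:
                # On the minus strand the TSS is the last base of the interval
                # and "upstream" means higher coordinates.
                tss = g.end - 1
                prom_start, prom_end = g.end, min(chrom_length, g.end + promoter_up)

            if bin_start <= tss < bin_end:
                cand, cand_rank = "TSS", 3
            elif overlaps(bin_start, bin_end, prom_start, prom_end):
                cand, cand_rank = "promoter", 2
            elif overlaps(bin_start, bin_end, g.start, g.end):
                cand, cand_rank = "gene_body", 1
            else:
                continue
            if cand_rank > rank:
                label, rank = cand, cand_rank
        labels.append(label)
    return labels
```

For example, `annotate_bins(2000, [Gene(1000, 1600, "+")], promoter_up=400)` labels ten bins for a toy 2000bp chromosome. Note that "intergenic" here simply means "not covered by anything in the gene list", so the result is only as good as the completeness of the annotation that is fed in.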
Thanks, I think this will be easier to use than biomart, although I'll probably still use biomart as a "double check"/control.
Do you know if TxDb.Mmusculus.UCSC.mm9.knownGene contains RefSeq or Ensembl genes? I also use BSgenome.Mmusculus... instead of TxDb.Musculus...
Are they the same? (Never mind, BSgenome didn't install on my R, but TxDb did. I guess TxDb is an older version?)