downloading annotated mouse genome
9.7 years ago
Affan ▴ 310

I want to download the mouse genome (mm9) with some basic annotations: intergenic regions, promoter regions, TSSs, gene starts, gene ends, CpG islands and so on.

Basically, I have split up the genome (mm9) into 200bp bins. Each bin has some characteristic associated with it (mainly histone modifications). I'd like to traverse the genome in 200bp bins and look at the annotation in the specific bin (mainly to relate histone modifications with certain parts of the genome - see Ernst & Kellis).
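To make the bin traversal concrete, here is a minimal sketch in plain Python (not the Bioconductor API). It assumes annotations are available as (chrom, start, end, label) tuples with 0-based, half-open coordinates; the function names and the toy features are hypothetical and are not real mm9 coordinates.

```python
from collections import defaultdict

BIN_SIZE = 200  # bin width in bp, as described in the question


def bins_overlapping(start, end, bin_size=BIN_SIZE):
    """Return the indices of all bins touched by the half-open interval [start, end)."""
    first = start // bin_size
    last = (end - 1) // bin_size
    return range(first, last + 1)


def label_bins(features, bin_size=BIN_SIZE):
    """Map (chrom, bin_index) -> set of labels of the features overlapping that bin."""
    labels = defaultdict(set)
    for chrom, start, end, label in features:
        for b in bins_overlapping(start, end, bin_size):
            labels[(chrom, b)].add(label)
    return labels


# Made-up toy features for illustration only.
toy_features = [
    ("chr1", 150, 450, "promoter"),
    ("chr1", 400, 1000, "gene"),
    ("chr1", 380, 420, "CpG island"),
]

bin_labels = label_bins(toy_features)
for key in sorted(bin_labels):
    print(key, sorted(bin_labels[key]))
```

A bin that never appears as a key overlaps none of the supplied features. Whether such a bin should be called intergenic depends on how complete the annotation set is.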

I think I can use biomart (which has an awesome package in Bioconductor), but I can't seem to get the ENTIRE genome. Biomart can give me promoter regions, TSS, gene start/end and so on. Can I assume that anything outside these areas is intergenic?

genome • 3.3k views
9.7 years ago

There isn't a single annotation file with everything you want, since most people don't want all of that at the same time. Since it seems you're already using Bioconductor, you might find the TxDb.Mmusculus.UCSC.mm9.knownGene package useful. It will allow you to get promoters, TSSs, gene starts/ends and so on. You can also get intergenic regions, but there's not a single command to do it (see my reply here). For things like CpG islands, just download the track in BED format from UCSC and load it with the rtracklayer package. Most other things you might want are also available via UCSC.
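Since there is no single command for intergenic regions, the usual idea is to merge the gene intervals and take what is left of each chromosome. The following is a minimal sketch of that logic in plain Python, not the TxDb or GenomicRanges API. It assumes 0-based, half-open (start, end) gene intervals on one chromosome; the function names, the toy genes and the chromosome length are all hypothetical.

```python
def merge_intervals(intervals):
    """Merge overlapping or touching half-open intervals into a sorted list."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlaps (or touches) the previous interval: extend it.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(iv) for iv in merged]


def intergenic(genes, chrom_length):
    """Return the gaps between merged gene intervals on [0, chrom_length)."""
    gaps = []
    cursor = 0  # end of the last gene seen so far
    for start, end in merge_intervals(genes):
        if start > cursor:
            gaps.append((cursor, start))
        cursor = max(cursor, end)
    if cursor < chrom_length:
        gaps.append((cursor, chrom_length))
    return gaps


# Made-up toy genes on a 1000 bp "chromosome", for illustration only.
toy_genes = [(100, 300), (250, 500), (800, 900)]
toy_gaps = intergenic(toy_genes, chrom_length=1000)
print(toy_gaps)
```

Note that the result is only "intergenic" relative to the gene set that was supplied. A different annotation source (for example RefSeq versus Ensembl) can give different gaps.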


Thanks, I think this will be easier to use than biomart. Although I'll probably go ahead and use biomart as a "double check"/control.

Do you know if TxDb.Mmusculus.UCSC.mm9.knownGene contains RefSeq or Ensembl genes? I also use BSgenome.Mmusculus... instead of TxDb.Musculus... Are they the same? (Never mind, BSgenome didn't install on my R, but TxDb did. I guess TxDb is an older version?)
