I'm looking for a FASTA file of all intergenic regions for mouse (mm9) and worm (ce10). Ideally, I want the equivalent of the NotFeature.fasta file for yeast (S288c), which can be found here:
This file "contains DNA sequences which are not contained within a feature. Features include ORF, ARS, CEN, rRNA, tRNA, snRNA, snoRNA, RNA genes, LTRs, telomeric elements and transposons."
If you can find annotations for all the features you are interested in and convert them to BED format, you should be able to use the complementBed program in Bedtools to produce a BED file of the regions lacking annotations. Note that complementBed also needs a genome file listing each chromosome and its length, and it expects the input BED to be sorted. You could then use getfasta (again, from Bedtools) to extract the FASTA sequences for the non-annotated regions produced by complementBed.
I am sure there are more efficient ways, but this will get you started.
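To make the logic concrete, here is a minimal pure-Python sketch of what the two steps do: find the gaps between annotated features on each chromosome, then cut those gaps out of the sequence. It is only an illustration, not how Bedtools is implemented, and the function names `complement_intervals` and `extract_fasta` are made up for this example. Coordinates are assumed to be 0-based and half-open, as in BED. With Bedtools itself the equivalent would be roughly `bedtools complement -i features.sorted.bed -g genome.txt > intergenic.bed` followed by `bedtools getfasta -fi genome.fa -bed intergenic.bed -fo intergenic.fa`, where all file names are placeholders.

```python
from collections import defaultdict


def complement_intervals(features, chrom_sizes):
    """Return (chrom, start, end) gaps not covered by any feature.

    features: iterable of (chrom, start, end), 0-based half-open (BED style).
    chrom_sizes: dict mapping chromosome name to its length.
    Hypothetical helper for illustration; not part of Bedtools.
    """
    by_chrom = defaultdict(list)
    for chrom, start, end in features:
        by_chrom[chrom].append((start, end))

    gaps = []
    for chrom, size in chrom_sizes.items():
        cursor = 0  # first position not yet covered by a feature
        for start, end in sorted(by_chrom.get(chrom, [])):
            if start > cursor:
                gaps.append((chrom, cursor, start))
            # Overlapping or nested features only move the cursor forward.
            cursor = max(cursor, end)
        if cursor < size:
            gaps.append((chrom, cursor, size))
    return gaps


def extract_fasta(gaps, genome):
    """Format each gap as a FASTA record.

    genome: dict mapping chromosome name to its sequence string
    (fine for a toy example; a real genome needs an indexed reader).
    """
    records = []
    for chrom, start, end in gaps:
        records.append(f">{chrom}:{start}-{end}\n{genome[chrom][start:end]}")
    return "\n".join(records)


if __name__ == "__main__":
    # Toy data, invented for the example.
    toy_features = [("chrI", 2, 5), ("chrI", 4, 8), ("chrI", 12, 15)]
    toy_sizes = {"chrI": 20, "chrII": 6}
    toy_genome = {"chrI": "ACGT" * 5, "chrII": "GGGCCC"}
    print(extract_fasta(complement_intervals(toy_features, toy_sizes), toy_genome))
```

A chromosome with no features at all comes back as one gap spanning its full length, so the genome file decides which chromosomes appear in the output.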
Thanks Neal, this is a good solution. Bedtools is amazingly useful.
For mouse, I've decided to use the 1000 bp segments upstream of all annotated transcription start sites, which can be downloaded directly here:
http://hgdownload-test.cse.ucsc.edu/goldenPath/mm9/bigZips/