Hello fellow BioStars!
I've been working for a while now with the CastEi and 129S5 sequences generated by the Mouse Genomes Project at the Sanger Institute.
Something I've always wished for is a correspondence between the coordinates of the reference genome (C57BL/6) and those of these genomes, as it would be useful for quick table comparisons and for computing other information. However, I can't think of a way to do this, and I don't know whether it has already been done.
Any ideas? I'd very much appreciate your comments.
Cheers!
I'm confused. Sanger generated reads, which were mapped onto the C57BL/6 genome using MAQ. You can try to assemble your own complete CastEi genome, but I don't think it's a good idea since the indels aren't as reliable and it would be hard to map genome coordinates between assemblies. What's wrong with using the C57BL/6 coordinates as a standard reference? That's what I'm doing in a similar project at the moment.
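To illustrate why indels make coordinate mapping between assemblies awkward, here is a minimal Python sketch. It assumes you already have a trusted list of indels for one chromosome, each written as a hypothetical tuple (ref_start, ref_len, alt_len): at 0-based reference position ref_start, ref_len reference bases are replaced by alt_len strain bases. The function name and the tuple layout are made up for this example and are not part of any Mouse Genomes Project tool.

```python
def ref_to_strain(pos, indels):
    """Translate a 0-based reference position into a strain position.

    indels: iterable of (ref_start, ref_len, alt_len) tuples (hypothetical
    layout, assumed non-overlapping). Returns None when pos falls inside
    reference bases that are deleted or replaced in the strain.
    """
    offset = 0
    for ref_start, ref_len, alt_len in sorted(indels):
        if ref_start + ref_len <= pos:
            # The whole event lies upstream of pos, so it shifts the coordinate.
            offset += alt_len - ref_len
        elif ref_start <= pos:
            # pos is inside the changed reference span: no direct counterpart.
            return None
        else:
            # Events are sorted, so everything else is downstream of pos.
            break
    return pos + offset


# Toy data: a 3 bp insertion before reference base 10, and a 5 bp deletion
# covering reference bases 20 to 24.
toy_indels = [(10, 0, 3), (20, 5, 0)]
for p in (5, 10, 22, 25):
    print(p, ref_to_strain(p, toy_indels))
```

Every upstream indel shifts everything after it, so a single wrong indel call puts all downstream coordinates off. That is the main reason I would stick to the C57BL/6 coordinates.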
Thanks both for your replies. My issue is that if, for example, I use SeqIO to fetch a region from a chromosome in C57BL/6 and then run the same command on the CastEi genome, I get different sequences. I think I should just use the BAM files for these operations.
You would get different sequences in CastEi and C57 for a given region because the castaneus genome isn't the same as the C57 genome.
I don't work with mouse as David seems to, but what I understood when I read about the Mouse Genomes Project was that they sequenced the different mouse strains as if they were different mouse populations. Since all strains belong to the same organism, Mus musculus, they were all mapped to the same reference, NCBIM37, so the genome coordinates should be directly comparable. I'm not posting this as an answer since, as I said, I'm not an expert on mouse, but I really hope the point David raised first helps you realize how much information you already have in your hands.