I have patient tumours implanted in nude mice. I want to extract the RNA from the tumours after a certain period of time and do RNA-Seq. There will no doubt be mouse RNA floating in there as well.
I am wondering if the mouse-human divergence is sufficiently high to be able to filter mouse reads out when I get my sequencing?
I would actually be interested in what is going on with expression of mouse genes, so maybe I could map reads to both genomes simultaneously?
I am thinking of 125bp single-end reads. Is it crucial to get PE reads?
Before doing any wet-lab experiment, I would collect some public RNA-Seq datasets (e.g. from GEO) as similar as possible to your experiments (i.e. human tumours and mouse normal tissues). Then I would create a hybrid human-mouse genome/transcriptome (i.e. by concatenating the reference genomes and annotation files) and map these datasets to both this hybrid reference and to human-only genome/transcriptome. Then check where and how they differ. In this way you can also assess the importance of read length and SE vs PE.
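To make the concatenation step concrete, here is a minimal sketch of building the hybrid FASTA. It assumes plain-text (uncompressed) FASTA files, and the `hs_`/`mm_` prefixes and function names are my own illustrative choices, not a convention any tool requires. The prefixing is there because both references usually contain contigs with the same names (e.g. `chr1`), which would clash in a combined file.

```python
def prefix_fasta_headers(lines, prefix):
    """Yield FASTA lines with `prefix` prepended to each sequence name."""
    for line in lines:
        if line.startswith(">"):
            # Header line: tag the contig name with the species prefix.
            yield ">" + prefix + line[1:]
        else:
            # Sequence line: pass through unchanged.
            yield line


def build_hybrid_fasta(sources, out_path):
    """Concatenate FASTA files into one hybrid reference.

    sources: list of (prefix, fasta_path) pairs (prefixes are hypothetical,
             e.g. "hs_" for human and "mm_" for mouse).
    """
    with open(out_path, "w") as out:
        for prefix, path in sources:
            with open(path) as fasta:
                for line in prefix_fasta_headers(fasta, prefix):
                    # Guard against a missing newline at the end of a file.
                    out.write(line if line.endswith("\n") else line + "\n")
```

The annotation files would need the same prefixes applied to their sequence-name column before concatenation, otherwise they will no longer match the contig names in the hybrid FASTA.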
I think it's important to map to a single hybrid genome in a single pass rather than mapping first to one genome and then to the other. In this way you can use MAPQ scores to tell whether a read maps significantly better to human than to mouse or vice versa.
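As a rough sketch of how the MAPQ-based split could look, the snippet below assigns each primary alignment in SAM text output to a species. It assumes the hybrid reference contigs were named with `hs_` and `mm_` prefixes (a hypothetical naming choice) and uses an arbitrary cutoff of 20. Aligners scale MAPQ differently, so the threshold would need adjusting for whichever one you use.

```python
from collections import Counter


def classify_read(sam_line, min_mapq=20, human_prefix="hs_", mouse_prefix="mm_"):
    """Assign one SAM alignment line to a species category.

    The prefixes and the MAPQ cutoff are illustrative assumptions.
    """
    fields = sam_line.rstrip("\n").split("\t")
    flag = int(fields[1])   # bitwise FLAG
    rname = fields[2]       # reference sequence name
    mapq = int(fields[4])   # mapping quality
    if flag & 0x4 or rname == "*":
        return "unmapped"
    if mapq < min_mapq:
        # Low MAPQ: the read does not map clearly better to one place.
        return "ambiguous"
    if rname.startswith(human_prefix):
        return "human"
    if rname.startswith(mouse_prefix):
        return "mouse"
    return "other"


def count_species(sam_lines, **kwargs):
    """Count primary alignments per category, skipping SAM header lines."""
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):
            continue
        flag = int(line.split("\t")[1])
        if flag & 0x900:
            # Skip secondary (0x100) and supplementary (0x800) alignments
            # so that each read is counted once.
            continue
        counts[classify_read(line, **kwargs)] += 1
    return counts
```

Running `count_species` over the lines of a SAM file gives a quick estimate of the human, mouse and ambiguous fractions, which can then be compared between read lengths or between SE and PE data.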
I can tell you that human and mouse are sufficiently divergent to throw off mapping. When I have accidentally mapped mouse samples to the human genome or vice versa, I typically get around 10-30% mappable reads.
However, that 10-30% accidental mismatch will certainly introduce unintended biases into your samples.
I would recommend that you try to get your samples as pure as possible and run a separate experiment if you want to know about mouse-specific expression.