Does removing host RNA before mapping reads to the pathogen genome add any benefit?
I'm worried about the false positives that host RNA could produce.
I'm working with pig and bacterial data.
Thanks!
In my experience, this approach does not help much in a case like yours (mammal vs. bacterium). Besides, wouldn't you still have a false positive problem if you pre-mapped to pig (i.e. wrongly assigning bacterial reads to pig)? In any case, I think the two genomes are divergent enough that you would not get many false positive hits.
I did a similar project (mouse + a certain bacterium), and there the problem was rather different: we mapped to a bacterial genome and got a lot of hits, but in the end it turned out that most of them came from other bacteria (mostly E. coli) that were contaminating the sample. A useful way to spot such erroneous cross-bacterial mappings turned out to be looking at the "mismatch spectrum", i.e. how many of your reads have 0, 1, 2, ... mismatches in the alignment to the genome you are mapping against (this is easy to get from the BAM/SAM). It should be mostly zeroes and ones, but in our case the most common number of mismatches was 6 or so. So the bacteria were close enough that reads from one could be mapped to the other, but always with many mismatches. Perhaps that could be helpful for you as well.
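For illustration, here is a minimal sketch of how such a mismatch spectrum could be tallied from SAM text. It is not the script used in the project described above. It assumes the aligner writes the optional `NM:i:` tag (edit distance to the reference, so indels are counted along with mismatches), and it skips unmapped, secondary and supplementary records. The function name `mismatch_spectrum` and the example records are made up for this sketch.

```python
from collections import Counter


def mismatch_spectrum(sam_lines):
    """Count primary mapped reads by the value of their NM (edit distance) tag."""
    spectrum = Counter()
    for line in sam_lines:
        if line.startswith("@"):          # header line
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:              # not a complete SAM record
            continue
        flag = int(fields[1])
        if flag & 0x4:                    # read is unmapped
            continue
        if flag & 0x900:                  # secondary or supplementary alignment
            continue
        for tag in fields[11:]:           # optional tags start at column 12
            if tag.startswith("NM:i:"):
                spectrum[int(tag[5:])] += 1
                break
    return spectrum


# Made-up records, only to show the input and output shape.
example = [
    "@HD\tVN:1.6",
    "r1\t0\tchr\t1\t60\t10M\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII\tNM:i:0",
    "r2\t0\tchr\t5\t60\t10M\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII\tNM:i:6",
    "r3\t4\t*\t0\t0\t*\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII",
    "r4\t16\tchr\t9\t60\t10M\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII\tNM:i:6",
]
spectrum = mismatch_spectrum(example)
for n_mismatches in sorted(spectrum):
    print(n_mismatches, spectrum[n_mismatches])
```

On real data the same function could be fed `sys.stdin` with the output of `samtools view` piped in. A spectrum that peaks well above 0 or 1 would be the warning sign described above.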
Thanks Mikael, some references would also be appreciated.
This is the study I mentioned:
http://journals.plos.org/plospathogens/article?id=10.1371/journal.ppat.1004600
I'm afraid I don't have anything else I can think of right now.
A great reference, Mikael, although I'm surprised you were able to publish this in vivo data. Extremely low abundance! Well, finding positive correlation among replicates seems to help!
I was surprised too :)
The paper you refer to has a LOT of work in it, apart from the RNA-seq runs.