Hi, I have paired-end RNA-seq data from a plant-microbe interaction study under different conditions. I mapped the reads to the plant genome using TopHat2 and got overall read alignment rates of 32.3% to 89.2% (concordant pair alignment rates of 30.6% to 86.4%). 1. How can I make sure that the mapping results are good? 2. How can I check that the mapped reads are actually correct, i.e. that all mapped reads come from the plant and that no bacterial reads are contaminating the output file?
You could use a metagenomics tool, like Kraken, to quickly estimate what contaminants are present in your sample. It won't tell you about issues mapping reads, but you'll at least know the source of your problem.
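Once you have a classification report, a few lines of Python can turn it into a quick "who is in my sample" summary. This is only a minimal sketch: it assumes the six-column, tab-delimited report layout (percent of reads, reads in clade, reads assigned directly, rank code, NCBI taxonomy ID, indented name) written by `kraken-report` or by Kraken 2's `--report` option. The function names and the example rows are made up for illustration and are not real results.

```python
import io


def parse_kraken_report(handle):
    """Yield (percent, clade_reads, direct_reads, rank, taxid, name) per report row.

    Assumes the standard six-column, tab-delimited Kraken report layout.
    """
    for line in handle:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        fields = line.split("\t")
        if len(fields) < 6:
            continue  # skip rows that do not match the assumed layout
        pct, clade, direct, rank, taxid, name = fields[:6]
        # Names are indented to show taxonomic depth, so strip the padding.
        yield float(pct), int(clade), int(direct), rank, taxid, name.strip()


def summarise_by_rank(handle, rank_code="D", min_percent=0.0):
    """Return (name, percent, clade_reads) for one rank, largest clade first.

    rank_code "D" is the domain level (e.g. Eukaryota vs. Bacteria).
    """
    rows = [
        (name, pct, clade)
        for pct, clade, _direct, rank, _taxid, name in parse_kraken_report(handle)
        if rank == rank_code and pct >= min_percent
    ]
    return sorted(rows, key=lambda row: row[2], reverse=True)


# Hypothetical report rows, invented purely to show the format.
EXAMPLE_ROWS = [
    (" 10.00", 1000, 1000, "U", 0, "unclassified"),
    (" 90.00", 9000, 0, "R", 1, "root"),
    (" 60.00", 6000, 100, "D", 2759, "  Eukaryota"),
    (" 30.00", 3000, 50, "D", 2, "  Bacteria"),
]
EXAMPLE_REPORT = "\n".join(
    "\t".join(str(field) for field in row) for row in EXAMPLE_ROWS
) + "\n"

if __name__ == "__main__":
    for name, pct, reads in summarise_by_rank(io.StringIO(EXAMPLE_REPORT)):
        print(f"{name}\t{pct:.2f}%\t{reads} reads")
```

Running this per sample and comparing the bacterial fraction with the TopHat2 alignment rate would show whether the low-mapping samples are the ones with the most bacterial reads. Note that the classification is only as good as the database it was run against, so a plant genome missing from the database will show up as unclassified rather than as plant.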
Thanks, Joe. I'll try it. So Kraken will tell me the source of the contamination if it finds such reads? I assume this means I'll have to supply my own reference. Could you please point me to some links for the Kraken manual and tutorials?