Hi All,
I have mixed RNA-seq data that might include human, viral, and bacterial RNA. I have several questions:
How can I download comprehensive FASTA files so that I can build a Bowtie2 index?
Which script can be used to check the proportion of RNA reads from each species?
It looks like NCBI BLAST could do this. Is there a script that can summarize the NCBI BLAST results?
Thanks.
Solved: BBSplit is exactly what I need. Thanks.
About BBSplit:
BBSplit is a tool that bins reads by mapping to multiple references simultaneously, using BBMap. The reads go to the bin of the reference they map to best. There are also disambiguation options, such that reads that map to multiple references can be binned with all of them, none of them, one of them, or put in a special "ambiguous" file for each of them. Paired reads will always be kept together.
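To make the binning idea concrete, here is a toy Python sketch of the logic described above. It is not BBSplit's actual code and does not call BBSplit. It assumes you already have, for each read (or read pair, treated as one unit), an alignment score against each reference. The function names, the `scores` layout, and the mode names ("all", "none", "one", "separate") are made up for illustration and are not BBSplit options.

```python
def bin_reads(scores, ambiguous="one"):
    """Assign each read to the bin of the reference it maps to best.

    scores: dict of read_id -> dict of reference_name -> alignment score
            (hypothetical input; a higher score means a better mapping).
    ambiguous: how to handle reads whose top score is tied across references
        "all"      -> place the read in every tied bin
        "none"     -> place the read in no bin
        "one"      -> place the read in one bin (alphabetically first tie)
        "separate" -> place the read in an "AMBIGUOUS_<ref>" bin per tied reference
    """
    if ambiguous not in ("all", "none", "one", "separate"):
        raise ValueError("unknown ambiguous mode: " + ambiguous)

    bins = {}
    for read_id, per_ref in scores.items():
        if not per_ref:
            # The read did not map to any reference.
            bins.setdefault("unmapped", []).append(read_id)
            continue

        top = max(per_ref.values())
        tied = sorted(ref for ref, score in per_ref.items() if score == top)

        if len(tied) == 1 or ambiguous == "all":
            targets = tied
        elif ambiguous == "none":
            targets = []
        elif ambiguous == "one":
            targets = tied[:1]
        else:  # "separate"
            targets = ["AMBIGUOUS_" + ref for ref in tied]

        for target in targets:
            bins.setdefault(target, []).append(read_id)
    return bins


def proportions(bins, n_reads):
    """Fraction of all input reads found in each bin.

    With ambiguous="all" a read can sit in several bins, so the
    fractions may sum to more than 1.
    """
    if n_reads == 0:
        return {}
    return {name: len(reads) / n_reads for name, reads in bins.items()}
```

For example, calling `bin_reads(scores, ambiguous="one")` and then `proportions(bins, len(scores))` gives a per-species fraction of reads, which is the kind of summary asked about in the question. With real data you would get the counts from the per-reference output files that the tool writes rather than from a score table like this.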
About Xenome:
We present a technique, with an associated tool Xenome, which performs fast, accurate and specific classification of xenograft-derived sequence read data. We have evaluated it on RNA-Seq data from human, mouse and human-in-mouse xenograft datasets.
I've moved my comment to an answer. Please accept it.