Many sequencing studies involve samples that contain genetic material from divergent organisms: transcriptomics of host-pathogen systems, sequencing of hybrid genomes, xenografts, mixed-species systems, metagenomics, and metatranscriptomics. A crucial step in these studies is identifying which organism each sequencing read originated from, and the experimental design should minimize the biases caused by cross-mapping, that is, reads mapping to the wrong source genome. In addition, pooling sufficiently different genetic material into a single sequencing library could significantly reduce experimental costs, but it requires careful planning and an assessment of the impact of cross-mapping. With these applications in mind, we designed Crossmapper, the first tool able to assess cross-mapping prior to sequencing, thereby allowing optimization of the experimental design. The workflow of the pipeline (A) and example output (B) are illustrated below:
Further details about Crossmapper are available in our recently accepted paper in the journal Bioinformatics and in our GitHub repository.
We hope that Crossmapper will be useful to the community, and of course any suggestions and recommendations are welcome.
+1 for conda package

Thanks! Also, you can plug whichever mapping software you want into the pipeline :)