I'm in the process of mapping a heap of RNAseq data and am unsure what to set as the maximum number of hits per read.
Previously we outsourced all of our reference mapping, and the company we used allowed a maximum of 1 hit per read. I assume it was this stringent because our reference was assembled de novo, but to me this seems like overkill. For example, CLC uses a default threshold of 10 hits/read.
I'm aware that, like most parameters, this is going to be specific to the dataset, but will increasing the maximum number of hits/read compromise the integrity of the analysis?
That decision is yours to make. If you use BBMap, you have the following options for handling multi-mappers: