Maximum Number of Hits Per Read when Mapping RNAseq Data
8.1 years ago

I'm in the process of mapping a heap of RNA-seq data and am unsure what value to set for the maximum number of hits per read.

Previously we outsourced all of our reference mapping, and the company we used allowed a maximum of 1 hit per read. I assume it was this stringent because our reference was assembled de novo, but to me this seems like a bit of overkill. For comparison, CLC uses a default threshold of 10 hits/read.

I'm aware that, like most parameters, this is going to be specific to the dataset, but will increasing the maximum number of hits/read compromise the integrity of the analysis?

RNA-Seq Assembly alignment sequence sequencing

You would need to make that decision yourself. If you use BBMap, you have the following options for handling multi-mappers:

ambiguous=best          (ambig) Set behavior on ambiguously-mapped reads (with multiple top-scoring mapping locations).
                            best    (use the first best site)
                            toss    (consider unmapped)
                            random  (select one top-scoring site randomly)
                            all     (retain all top-scoring sites)
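
To make the four behaviours concrete, here is a minimal Python sketch of what each setting would do to a single read's candidate sites. This is only an illustration of the semantics described in the help text above, not BBMap's actual implementation; the function name `resolve_ambiguous` and the `(site, score)` tuple layout are assumptions made up for the example.

```python
import random


def resolve_ambiguous(hits, mode="best", rng=None):
    """Return the sites kept for one read under a given ambiguity policy.

    hits: list of (site, score) tuples in the order they were found
          (hypothetical layout, chosen for this sketch).
    mode: one of "best", "toss", "random", "all".
    rng:  optional random.Random instance, used only by "random".
    """
    if not hits:
        return []

    # Only the top-scoring sites count as candidates.
    top_score = max(score for _, score in hits)
    top_sites = [site for site, score in hits if score == top_score]

    # A single top-scoring site is not ambiguous, so every mode keeps it.
    if len(top_sites) == 1:
        return top_sites

    if mode == "best":
        return [top_sites[0]]      # first of the tied best sites
    if mode == "toss":
        return []                  # read is treated as unmapped
    if mode == "random":
        chooser = rng if rng is not None else random
        return [chooser.choice(top_sites)]
    if mode == "all":
        return list(top_sites)     # keep every tied best site
    raise ValueError(f"unknown mode: {mode!r}")


# One read with two tied top-scoring sites and one weaker site.
example_hits = [("contig1:100", 50), ("contig2:200", 50), ("contig3:300", 40)]
for policy in ("best", "toss", "random", "all"):
    print(policy, resolve_ambiguous(example_hits, policy))
```

Note that "toss" is roughly the equivalent of the 1 hit per read rule described in the question: any read with more than one equally good location is dropped, which is the safest choice for counting but loses reads from paralogs or duplicated contigs.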