Hi,
I'm having trouble understanding several TopHat parameters:
-r/--mate-inner-dist <int> This is the expected (mean) inner distance between mate pairs. For example, for paired end runs with fragments selected at 300bp, where each end is 50bp, you should set -r to be 200. There is no default, and this parameter is required for paired end runs.
--mate-std-dev <int> The standard deviation for the distribution on inner distances between mate pairs. The default is 20bp.
Is this the mean and standard deviation of the distance between the two paired-end reads on the cDNA?
pair 1  ------------->
cDNA    ----------------------------------------
pair 2                          <---------------
                      <-------->
                      inter-pair distance
Or is it the size of the cDNA between the sequencing adapters?
adapter 5'  -------------
cDNA                     -----------------------
adapter 3'                                      ---------------
Library     ---------------------------------------------------
Could anyone also explain these two parameters:
--closure-search Enables the mate pair closure-based search for junctions. Closure-based search should only be used when the expected inner distance between mates is small (<= 50bp)
--coverage-search Enables the coverage based search for junctions. Use when coverage search is disabled by default (such as for reads 75bp or longer), for maximum sensitivity.
Thanks a lot !
N.
As of TopHat 1.3.2, -r is no longer required. The release notes say: "Deprecated -r as a required parameter (defaults to 50)". I think the manual is out of date.
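For the -r value itself, the manual's example (300bp fragments, 50bp ends, -r 200) boils down to simple arithmetic. Here is a minimal sketch of it. The function name and the adapter_total parameter are my own, not part of TopHat, and the adapter handling rests on the assumption that a size-selected fragment may still include adapter sequence that has to be subtracted first.

```python
def inner_distance(fragment_size, read_length, adapter_total=0):
    """Expected inner distance between the two mates of a pair.

    fragment_size: size of the selected fragment in bp
    read_length:   length of each sequenced end in bp
    adapter_total: combined adapter length in bp, if the fragment size
                   includes adapters (assumption; 0 means cDNA only)
    """
    insert_size = fragment_size - adapter_total
    # The inner distance is whatever is left once both ends are read.
    # It can be negative when the two reads overlap.
    return insert_size - 2 * read_length


# The manual's example: 300bp fragments, 50bp ends
print(inner_distance(300, 50))
```

With the manual's numbers this gives 200, which corresponds to the first diagram in the question (distance between the reads on the cDNA), not the full library size.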
Regarding the second parameter: in eukaryotes, mRNAs undergo splicing, and the regions of RNA that are kept in the mature mRNA are exons. When a transcriptome is sequenced, some reads come from a single exon and some span more than one exon; the latter are junction reads. The --coverage-search option enables a search for these junctions based on read coverage. It is on by default, although the manual text quoted above says it is disabled by default for reads of 75bp or longer. It can be time consuming, so it can be turned off with --no-coverage-search.
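To show how the options discussed in this thread fit together, here is a rough sketch that assembles a paired-end TopHat command line as an argument list. Only -r, --mate-std-dev and --no-coverage-search come from the thread; the helper name, the index name and the read file names are placeholders I made up, and nothing is actually executed.

```python
def tophat_args(index, reads_1, reads_2, inner_dist,
                std_dev=20, coverage_search=True):
    """Build an argument list for a paired-end TopHat run (not executed)."""
    args = ["tophat", "-r", str(inner_dist), "--mate-std-dev", str(std_dev)]
    if not coverage_search:
        # Skip the slower coverage-based junction search
        args.append("--no-coverage-search")
    # Positional arguments: Bowtie index, then the two read files
    args += [index, reads_1, reads_2]
    return args


# Placeholder index and file names, for illustration only
cmd = tophat_args("genome_index", "reads_1.fq", "reads_2.fq",
                  inner_dist=200, coverage_search=False)
print(" ".join(cmd))
```

The resulting list could be passed to subprocess.run if you want to launch TopHat from a script; check the options against the manual for your installed version first.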