Hi,
I have 12 paired-end FASTQ (.fq) files.
The TopHat2 protocol gives this syntax:
tophat -p 8 -G genes.gtf genome file-1.fq file-2.fq
but I have read about the -r option for paired-end data, used like this:
tophat -r 200 genome file-1.fq file-2.fq
I am about to start using TopHat, so could someone please tell me whether I should use -r or not? If I should, why does the protocol by Trapnell et al. not mention -r at all?
Thank you
Thank you, but how do I find out the length of each read or of my fragments? I also read something about comma separation, but I see no commas in the TopHat manual.
Use the mean fragment length from your Bioanalyzer QC prior to sequencing (or similar) minus the total read length that you sequenced (e.g. 2 x 100bp = 200bp).
Actually, I don't have any information about the protocol, so I used 200.
I think you need to find out how your experiment was done before you do any more analysis; otherwise the results could be useless.
For example, if you sonicated your samples to 300bp and did 150bp Paired End sequencing, your insert (-r) is 0. If you sonicated your samples to 600bp and did 75bp Paired End sequencing, your insert (-r) is 450. You MUST know this before just choosing a number at random. If you don't have information on the protocol, find out the information.
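The arithmetic in those two examples can be written as a small shell helper. This is only a sketch: inner_dist is a made-up function name, and it assumes the fragment length you plug in is the sequenced fragment itself, without adapters. The last line only prints a TopHat command, reusing the file names from the question, so you can check it before running anything.

```shell
#!/bin/sh
# inner_dist FRAGMENT_LEN READ_LEN   (hypothetical helper, not part of TopHat)
# Prints fragment length minus both reads, i.e. the value to pass to -r.
inner_dist() {
    echo $(( $1 - 2 * $2 ))
}

inner_dist 300 150   # 300bp fragments, 2 x 150bp reads
inner_dist 600 75    # 600bp fragments, 2 x 75bp reads

# Print the command rather than run it; file names are from the question.
echo "tophat -p 8 -G genes.gtf -r $(inner_dist 600 75) genome file-1.fq file-2.fq"
```

The two calls print 0 and 450, matching the numbers above.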
Thank you. I received an email from the company: the insert size was 248-277 bp, and I used 200. Did I do something wrong?
You don't need me to tell you that 200 does not equal 248-277. However, I don't know how strictly TopHat uses the parameter. You'll have to read up on that to see whether it could affect your alignment.
Then, given that the insert size across my samples ranges from 248 to 277 bp, what do you suggest I set?
Apologies if this is blunt, but please read the manual and stop asking me to read the manual for you.
ok thank you
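For anyone landing here with the same 248-277 bp figure, here is one way to turn it into a starting value for -r. Treat it as a sketch under assumptions, not a recommendation: it assumes the company's "insert size" means the fragment length without adapters, it takes the midpoint of the range, and READ_LEN=100 is a placeholder because the read length was never stated in this thread. mid_inner_dist is a made-up helper name.

```shell
#!/bin/sh
# mid_inner_dist MIN_FRAGMENT MAX_FRAGMENT READ_LEN   (hypothetical helper)
# Midpoint of the reported fragment range minus both reads (integer arithmetic).
mid_inner_dist() {
    echo $(( ($1 + $2) / 2 - 2 * $3 ))
}

READ_LEN=100   # ASSUMPTION: placeholder read length, replace with your own
R=$(mid_inner_dist 248 277 "$READ_LEN")

# Print the command instead of running it, so the value can be checked first.
echo "tophat -p 8 -G genes.gtf -r $R genome file-1.fq file-2.fq"
```

With a different read length the value changes by twice the difference, so check the read length in your FASTQ files first. As far as I know, TopHat also has a --mate-std-dev option for the spread around the -r value; the manual has the details.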