Pei (8.4 years ago):
Hi all:
In a TopHat run there is a step, "Searching for junctions via segment mapping", which is carried out by segment_juncs.
However, when processing some mouse data, I found that this step finished in about 2 minutes, whereas it takes much longer on other datasets. It looked as if the step had been skipped.
Do you know what happened?
In any case, the TopHat job finished without reporting any errors, but the resulting mapping rate was low, ~45%.
Thanks in advance! Best
Can you show the commands that you used for these datasets?
Thank you.
My command was:
The data I used was downloaded from NCBI: GSE30352.
Can you check the tophat_out folder and look for the junctions.bed file? If it is empty, then that step was skipped; if it is not, then the step was not skipped.
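For example, a quick check along these lines (a minimal sketch in Python, assuming the default output directory name tophat_out) should tell you whether the file is empty and how many lines it has:

    # Minimal sketch: report whether tophat_out/junctions.bed is missing or empty
    # and, if present, how many lines (junction records) it contains.
    from pathlib import Path

    bed = Path("tophat_out/junctions.bed")  # assumed default TopHat output location
    if not bed.exists() or bed.stat().st_size == 0:
        print("junctions.bed is missing or empty - the junction search step may have been skipped")
    else:
        n_lines = sum(1 for _ in bed.open())
        print(f"junctions.bed has {n_lines} lines")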
Not empty. That file had 141909 lines, and the "Searching for junctions via segment mapping" step took about 5 minutes. However, in another dataset, SRR1915443, junctions.bed had 143339 lines, and that step took more than 1 hour.
Both were mapped against mm10. SRR306769 had 45613332 reads, while SRR1915443 had 62575245 reads.
I just looked at the samples. They differ in read length (50 bp vs 76 bp) and in quality (SRR1915443 has good read quality), so the number of reads that pass QC will differ between the two samples. If you used the default settings (segment length 25), the two samples will also differ in the number of segments per read (2 vs 3). These factors may have contributed to the difference in the time it took to map the junction reads.
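To make the segment arithmetic concrete (a rough sketch, assuming TopHat's default --segment-length of 25 bp and ignoring how the leftover bases at the end of a read are handled):

    # Rough sketch: approximate number of segments TopHat splits a read into,
    # assuming the default --segment-length of 25 bp.
    def segments_per_read(read_length, segment_length=25):
        return read_length // segment_length

    print(segments_per_read(50))  # 2 segments for the 50 bp reads (SRR306769)
    print(segments_per_read(76))  # 3 segments for the 76 bp reads (SRR1915443)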