TopHat Failed In 'Reporting output tracks' Stage
8.0 years ago

Hello everyone!

I've run into the following problem while working with ENCODE Project data; maybe someone here can give me some advice.

I downloaded the raw sequencing data for library ENCLB555APY from https://www.encodeproject.org/experiments/ENCSR000CPY/ and tried to map it to the human genome downloaded from ftp://igenome:G3nom3s4u@ussd-ftp.illumina.com/Homo_sapiens/UCSC/hg38/Homo_sapiens_UCSC_hg38.tar.gz.

I then ran TopHat v2.1.0 as follows:

tophat2 -p 8 --b2-very-sensitive -o tophat_res/ENCLB555APY/ Homo_sapiens/UCSC/hg38/Sequence/Bowtie2Index/genome  ENCFF000HGG.fastq.gz ENCFF000HHF.fastq.gz

It failed with the following tophat.log, ending in an error I could not find anything about on Google:

[2016-12-03 17:27:10] Beginning TopHat run (v2.1.0)
-----------------------------------------------
[2016-12-03 17:27:10] Checking for Bowtie
                  Bowtie version:        2.2.6.0
[2016-12-03 17:27:10] Checking for Bowtie index files (genome)..
[2016-12-03 17:27:10] Checking for reference FASTA file
[2016-12-03 17:27:10] Generating SAM header for Homo_sapiens/UCSC/hg38/Sequence/Bowtie2Index/genome
[2016-12-03 17:27:12] Preparing reads
         left reads: min. length=76, max. length=76, 133266131 kept reads (283399 discarded)
        right reads: min. length=76, max. length=76, 133088200 kept reads (461330 discarded)
[2016-12-03 18:28:22] Mapping left_kept_reads to genome genome with Bowtie2
[2016-12-04 03:31:38] Mapping left_kept_reads_seg1 to genome genome with Bowtie2 (1/3)
[2016-12-04 03:42:39] Mapping left_kept_reads_seg2 to genome genome with Bowtie2 (2/3)
[2016-12-04 03:53:31] Mapping left_kept_reads_seg3 to genome genome with Bowtie2 (3/3)
[2016-12-04 04:07:24] Mapping right_kept_reads to genome genome with Bowtie2
[2016-12-04 12:42:21] Mapping right_kept_reads_seg1 to genome genome with Bowtie2 (1/3)
[2016-12-04 13:04:21] Mapping right_kept_reads_seg2 to genome genome with Bowtie2 (2/3)
[2016-12-04 13:24:26] Mapping right_kept_reads_seg3 to genome genome with Bowtie2 (3/3)
[2016-12-04 13:48:17] Searching for junctions via segment mapping
[2016-12-04 14:04:52] Retrieving sequences for splices
[2016-12-04 14:06:20] Indexing splices
[2016-12-04 14:06:53] Mapping left_kept_reads_seg1 to genome segment_juncs with Bowtie2 (1/3)
[2016-12-04 14:10:07] Mapping left_kept_reads_seg2 to genome segment_juncs with Bowtie2 (2/3)
[2016-12-04 14:13:25] Mapping left_kept_reads_seg3 to genome segment_juncs with Bowtie2 (3/3)
[2016-12-04 14:16:18] Joining segment hits
[2016-12-04 14:20:55] Mapping right_kept_reads_seg1 to genome segment_juncs with Bowtie2 (1/3)
[2016-12-04 14:29:53] Mapping right_kept_reads_seg2 to genome segment_juncs with Bowtie2 (2/3)
[2016-12-04 14:36:11] Mapping right_kept_reads_seg3 to genome segment_juncs with Bowtie2 (3/3)
[2016-12-04 14:40:37] Joining segment hits
[2016-12-04 14:47:37] Reporting output tracks
        [FAILED]
Error running /usr/bin/tophat_reports --min-anchor 8 --splice-mismatches 0 --min-report-intron 50 --max-report-intron 500000 --min-isoform-fraction 0.15 --output-dir tophat_res/ENCLB555APY// --max-multihits 20 --max-seg-multihits 40 --segment-length 25 --segment-mismatches 2 --min-closure-exon 100 --min-closure-intron 50 --max-closure-intron 5000 --min-coverage-intron 50 --max-coverage-intron 20000 --min-segment-intron 50 --max-segment-intron 500000 --read-mismatches 2 --read-gap-length 2 --read-edit-dist 2 --read-realign-edit-dist 3 --max-insertion-length 3 --max-deletion-length 3 -z gzip -p8 --inner-dist-mean 50 --inner-dist-std-dev 20 --no-closure-search --no-coverage-search --no-microexon-search --sam-header tophat_res/ENCLB555APY//tmp/genome_genome.bwt.samheader.sam --report-discordant-pair-alignments --report-mixed-alignments --samtools=/usr/bin/samtools_0.1.18 --bowtie2-max-penalty 6 --bowtie2-min-penalty 2 --bowtie2-penalty-for-N 1 --bowtie2-read-gap-open 5 --bowtie2-read-gap-cont 3 --bowtie2-ref-gap-open 5 --bowtie2-ref-gap-cont 3 Homo_sapiens/UCSC/hg38/Sequence/Bowtie2Index/genome.fa tophat_res/ENCLB555APY//junctions.bed tophat_res/ENCLB555APY//insertions.bed tophat_res/ENCLB555APY//deletions.bed tophat_res/ENCLB555APY//fusions.out tophat_res/ENCLB555APY//tmp/accepted_hits tophat_res/ENCLB555APY//tmp/left_kept_reads.mapped.bam,tophat_res/ENCLB555APY//tmp/left_kept_reads.candidates tophat_res/ENCLB555APY//tmp/left_kept_reads.bam tophat_res/ENCLB555APY//tmp/right_kept_reads.mapped.bam,tophat_res/ENCLB555APY//tmp/right_kept_reads.candidates tophat_res/ENCLB555APY//tmp/right_kept_reads.bam
Error: failed to retrieve right read for pair # 2037377 !

It looks like an error in the input files, but I find that highly improbable. So what did I do wrong, and is there any way to get past this error without re-running TopHat?
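One way to test the input-file hypothesis directly is to walk both FASTQ files in lockstep and compare read IDs. In the log above, kept plus discarded reads add up to 133,549,530 on both sides (133266131 + 283399 and 133088200 + 461330), so the two files at least hold the same number of records. The sketch below is a minimal check, not part of TopHat. The function names `read_ids` and `first_mismatch` are made up for illustration, and it assumes plain 4-line FASTQ records whose mate IDs are identical apart from an optional `/1` or `/2` suffix. The index it returns counts raw FASTQ records and may not correspond to TopHat's internal "pair #" numbering.

```python
import gzip
from itertools import zip_longest


def read_ids(path):
    """Yield the read ID of each 4-line FASTQ record in a gzipped file."""
    with gzip.open(path, "rt") as handle:
        for line_no, line in enumerate(handle):
            if line_no % 4 != 0:
                continue  # only the header line of each record is needed
            fields = line[1:].split()  # drop '@' and any trailing comment
            if not fields:
                continue  # blank line, e.g. at the very end of the file
            name = fields[0]
            # Assumption: mates may carry an old-style /1 or /2 suffix.
            if name.endswith(("/1", "/2")):
                name = name[:-2]
            yield name


def first_mismatch(left_path, right_path):
    """Return (record_index, left_id, right_id) for the first pair whose
    IDs differ, or None if the files pair up completely (1-based index)."""
    pairs = zip_longest(read_ids(left_path), read_ids(right_path))
    for index, (left, right) in enumerate(pairs, start=1):
        if left != right:  # also catches one file ending early (None)
            return index, left, right
    return None
```

For the files in this question the call would be `first_mismatch("ENCFF000HGG.fastq.gz", "ENCFF000HHF.fastq.gz")`. A result of `None` would suggest the inputs are consistently paired and the failure lies elsewhere.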

Thanks in advance,

Ivan

RNA-Seq rna-seq software error TopHat TopHat2 • 3.4k views

Hello Ivan,

I got the same error. Could you please let me know how you fixed it, or what the cause was?

Any help is appreciated.

Zuolin Bai


You should know that the old 'Tuxedo' pipeline of TopHat(2) and Cufflinks is no longer the recommended toolset for RNA-seq analysis. The software is deprecated or in low-maintenance mode and should be replaced by HISAT2, StringTie and Ballgown. See this paper: "Transcript-level expression analysis of RNA-seq experiments with HISAT, StringTie and Ballgown". (If you can't get access to that publication, let me know and I'll -cough- help you.) There are also other alternatives, including alignment with STAR or BBMap, or pseudo-alignment with Salmon.

8.0 years ago
mastal511 ★ 2.1k

I've had tophat fail at the 'Reporting output tracks' stage because it ran out of memory, but I didn't get the error message you're getting.


Thanks for the response! The machine has 32 GB of RAM and approx. 900 GB of free storage space, so I wouldn't expect any memory problems. The 'failed to retrieve right read for pair' error puzzles me as well.


Actually, 32 GB is not that much, given that you have 133 million read pairs. Check how much memory it is using while it runs.
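If watching `top` for the whole run is impractical, a rough alternative is to launch the command from a small wrapper and read the peak memory afterwards. The sketch below uses Python's standard `resource` module on a Unix-like system. The helper name `run_and_report_peak_rss` is made up for illustration. The value it returns is the largest resident set size reached by any single child process that was waited for, not the sum over the whole process tree.

```python
import resource
import subprocess
import sys


def run_and_report_peak_rss(cmd):
    """Run cmd to completion and return (exit_code, peak_rss).

    peak_rss is ru_maxrss for terminated children: the resident set size
    of the largest child or waited-for descendant. The unit is kilobytes
    on Linux and bytes on macOS.
    """
    completed = subprocess.run(cmd)
    usage = resource.getrusage(resource.RUSAGE_CHILDREN)
    return completed.returncode, usage.ru_maxrss


# Small demonstration: a child process that allocates about 10 MB.
code, peak = run_and_report_peak_rss(
    [sys.executable, "-c", "x = bytearray(10**7)"]
)
print("exit code:", code, "peak RSS:", peak)
```

To apply it to the run in question, pass the original command as a list, for example `["tophat2", "-p", "8", "--b2-very-sensitive", "-o", "tophat_res/ENCLB555APY/", ...]` with the index and FASTQ paths filled in. A peak close to the machine's 32 GB would support the out-of-memory explanation.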
