Small accepted-hits.bam file and all FPKM values are zero
10.4 years ago
Ginsea Chen ▴ 140

Hi all,

I used TopHat and Cufflinks to analyze my RNA-seq data. The commands I ran were as follows:

$ tophat -p 8 -G genes.gtf -o SRR123456 genome SRR123456_1.fq SRR123456_2.fq
$ cufflinks accepted-hits.bam -G genes.gtf -o result

The problem is that the FPKM values of all genes in the genes.fpkm_tracking file are zero. I then found that accepted-hits.bam is very small, about 1 MB.

I can't find a way to solve this, so I am asking for your help.

My tophat.log file:

[2014-07-06 20:57:33] Beginning TopHat run (v2.0.11)
-----------------------------------------------
[2014-07-06 20:57:33] Checking for Bowtie
          Bowtie version:     2.2.3.0
[2014-07-06 20:57:33] Checking for Samtools
        Samtools version:     0.1.19.0
[2014-07-06 20:57:33] Checking for Bowtie index files (genome)..
[2014-07-06 20:57:33] Checking for reference FASTA file
[2014-07-06 20:57:33] Generating SAM header for genome
[2014-07-06 20:57:44] Reading known junctions from GTF file
[2014-07-06 20:57:47] Preparing reads
     left reads: min. length=76, max. length=76, 27091797 kept reads (1057227 discarded)
[2014-07-06 21:03:27] Building transcriptome data files SRR392837/tmp/genes
[2014-07-06 21:03:49] Building Bowtie index from genes.fa
[2014-07-06 21:15:00] Mapping left_kept_reads to transcriptome genes with Bowtie2 
[2014-07-06 21:51:47] Resuming TopHat pipeline with unmapped reads
[2014-07-06 21:51:47] Mapping left_kept_reads.m2g_um to genome genome with Bowtie2 
[2014-07-07 00:03:34] Mapping left_kept_reads.m2g_um_seg1 to genome genome with Bowtie2 (1/3)
[2014-07-07 00:30:20] Mapping left_kept_reads.m2g_um_seg2 to genome genome with Bowtie2 (2/3)
[2014-07-07 00:35:32] Mapping left_kept_reads.m2g_um_seg3 to genome genome with Bowtie2 (3/3)
[2014-07-07 02:10:52] Searching for junctions via segment mapping
[2014-07-07 02:28:58] Retrieving sequences for splices
[2014-07-07 02:29:33] Indexing splices
[2014-07-07 02:29:54] Mapping left_kept_reads.m2g_um_seg1 to genome segment_juncs with Bowtie2 (1/3)
[2014-07-07 02:37:24] Mapping left_kept_reads.m2g_um_seg2 to genome segment_juncs with Bowtie2 (2/3)
[2014-07-07 02:41:00] Mapping left_kept_reads.m2g_um_seg3 to genome segment_juncs with Bowtie2 (3/3)
[2014-07-07 03:29:21] Joining segment hits
[2014-07-07 04:07:49] Reporting output tracks
-----------------------------------------------
[2014-07-07 04:13:56] A summary of the alignment counts can be found in SRR392837/align_summary.txt
[2014-07-07 04:13:56] Run complete: 07:16:22 elapsed
FPKM tophat RNA-Seq • 3.9k views

Can you paste the content of the file SRR392837/align_summary.txt?

[2014-07-07 04:13:56] A summary of the alignment counts can be found in SRR392837/align_summary.txt

Hi Pandey,

My align_summary.txt is as follows:

Reads:
          Input     :  28149024
           Mapped   :      1868 ( 0.0% of input)
            of these:      1546 (82.8%) have multiple alignments (436 have >20)
 0.0% overall read mapping rate.
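
The summary rounds the rate to 0.0%, so it can help to compute the exact figure from the Input and Mapped counts. The snippet below is a minimal sketch, assuming a single-section summary like the one posted above (a paired-end summary with separate left and right sections would need extra handling). The function name `mapping_rate` is hypothetical and not part of TopHat.

```shell
# Print the mapped percentage (4 decimal places) from a TopHat
# align_summary.txt. mapping_rate is a hypothetical helper name.
# Assumes lines of the form "Input : N" and "Mapped : N ( x% of input)".
mapping_rate() {
    awk '
        $1 == "Input"  { input  = $3 }   # total input reads
        $1 == "Mapped" { mapped = $3 }   # reads with at least one alignment
        END {
            if (input > 0) {
                printf "%.4f\n", 100 * mapped / input
            } else {
                print "NA"               # no usable Input line was found
                exit 1
            }
        }
    ' "$1"
}
```

For the counts posted here (1868 mapped out of 28149024 input reads), `mapping_rate SRR392837/align_summary.txt` would print roughly 0.0066, which confirms that essentially nothing aligned.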

See, that's the problem. Almost none of your reads are aligning to the reference genome, so you see zero FPKM values for the genes. The low mapping rate can be due to:

  1. Using the wrong reference genome to align your data.
  2. The library type of your data not matching the default TopHat library type or the one you selected.

Normally, 70-80% of input reads map to the reference genome for mammalian genomes.
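
One way to tell the two causes apart is to align a small subset of reads directly with Bowtie2 before rerunning the full pipeline. The following is only a sketch: `first_reads` is a hypothetical helper, the file and index names are taken from the commands in the question, and the subset size is arbitrary. Bowtie2 on its own does not align spliced reads, so its rate will be lower than a healthy TopHat run, but a rate near zero would still point to a reference or input problem.

```shell
# Print the first N reads of a FASTQ file (4 lines per read).
# first_reads is a hypothetical helper name used only in this sketch.
first_reads() {
    head -n "$(( $2 * 4 ))" "$1"
}

# Example use (assumes bowtie2 is on PATH and "genome" is the index
# prefix from the original tophat command):
#   first_reads SRR123456_1.fq 100000 > subset_1.fq
#   bowtie2 -p 8 -x genome -U subset_1.fq -S /dev/null
# Bowtie2 reports the overall alignment rate on stderr.
#
# To test the second cause, TopHat accepts an explicit --library-type
# (fr-unstranded, fr-firststrand or fr-secondstrand), for example:
#   tophat -p 8 --library-type fr-firststrand -G genes.gtf \
#       -o SRR123456 genome SRR123456_1.fq SRR123456_2.fq
```

Which library type applies depends on how the library was prepared, so the value shown in the comment is a placeholder rather than a recommendation.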


Hi

I am sure that my RNA-seq data correspond to the reference genome, so the error may be due to the second reason. Thank you.
