Hi,
I am running tophat + cufflinks to analyze an RNA-seq dataset downloaded from GEO [1].
My procedure is below (I first ran fastq-dump to convert the sra file into fastq). However, the FPKM values are all 0 in the resulting transcripts.gtf file.
Could you please give me some advice on how to get non-zero FPKM values?
Thanks a lot!
[1] ftp://ftp-trace.ncbi.nlm.nih.gov/sra/sra-instant/reads/ByRun/sra/SRR/SRR306/SRR306758/SRR306758.sra
tophat mm9 SRR306758.fastq
[Mon Jan 5 14:23:52 2015] Beginning TopHat run (v1.3.1)
-----------------------------------------------
[Mon Jan 5 14:23:52 2015] Preparing output location ./tophat_out/
[Mon Jan 5 14:23:52 2015] Checking for Bowtie index files
[Mon Jan 5 14:23:52 2015] Checking for reference FASTA file
Warning: Could not find FASTA file mm9.fa
[Mon Jan 5 14:23:52 2015] Reconstituting reference FASTA file from Bowtie index
Executing: /genome/bin/bowtie-inspect mm9 > ./tophat_out/tmp/mm9.fa
[Mon Jan 5 14:26:32 2015] Checking for Bowtie
Bowtie version: 0.12.7.0
[Mon Jan 5 14:26:32 2015] Checking for Samtools
Samtools Version: 0.1.19
[Mon Jan 5 14:26:32 2015] Generating SAM header for mm9
[Mon Jan 5 14:27:03 2015] Preparing reads
format: fastq
quality scale: phred33 (default)
Left reads: min. length=76, count=18238434
[Mon Jan 5 14:33:01 2015] Mapping left_kept_reads against mm9 with Bowtie
[Mon Jan 5 15:37:36 2015] Processing bowtie hits
[Mon Jan 5 15:50:07 2015] Mapping left_kept_reads_seg1 against mm9 with Bowtie (1/3)
[Mon Jan 5 16:59:49 2015] Mapping left_kept_reads_seg2 against mm9 with Bowtie (2/3)
[Mon Jan 5 18:15:05 2015] Mapping left_kept_reads_seg3 against mm9 with Bowtie (3/3)
[Mon Jan 5 19:11:27 2015] Searching for junctions via segment mapping
[Mon Jan 5 19:17:48 2015] Retrieving sequences for splices
[Mon Jan 5 19:21:26 2015] Indexing splices
[Mon Jan 5 19:21:44 2015] Mapping left_kept_reads_seg1 against segment_juncs with Bowtie (1/3)
[Mon Jan 5 19:31:38 2015] Mapping left_kept_reads_seg2 against segment_juncs with Bowtie (2/3)
[Mon Jan 5 19:41:48 2015] Mapping left_kept_reads_seg3 against segment_juncs with Bowtie (3/3)
[Mon Jan 5 19:50:39 2015] Joining segment hits
[Mon Jan 5 19:59:59 2015] Reporting output tracks
-----------------------------------------------
Run complete [05:46:13 elapsed]
cufflinks accepted_hits.bam -G /disk4/linp/brawand_Data/Mus_musculus.NCBIM37.60.gtf -o SRR306758_trial_1
You are using Cufflinks v2.2.1, which is the most recent release.
[09:45:41] Loading reference annotation.
[09:45:55] Inspecting reads and determining fragment length distribution.
> Processed 27910 loci. [*************************] 100%
> Map Properties:
> Normalized Map Mass: 13631307.57
> Raw Map Mass: 13631307.57
> Fragment Length Distribution: Truncated Gaussian (default)
> Default Mean: 200
> Default Std Dev: 80
[09:46:29] Estimating transcript abundances.
> Processing Locus 8:122386172-122400888 [************************ ] 99%
8:122434259-126869624 Warning: Skipping large bundle.
> Processed 27909 loci. [*************************] 100%
Can you check the alignment statistics, e.g. how many reads mapped to the genome? Have you tried asking cufflinks to also output the raw read counts?
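One way to check the alignment statistics is sketched below. It counts the distinct read names among mapped records in SAM text, so a read reported at several locations is counted once. The function name count_mapped_reads is made up for this example, and the sketch assumes single-end reads (as in the TopHat log above) and that samtools is on the PATH.

```shell
# Count distinct mapped reads in SAM records read on stdin.
# Field 1 is the read name, field 2 is the FLAG; bit 0x4 set means "unmapped".
# count_mapped_reads is a hypothetical helper name, not part of samtools.
count_mapped_reads() {
    awk -F'\t' '!/^@/ && int($2 / 4) % 2 == 0 { print $1 }' \
        | sort -u \
        | wc -l \
        | tr -d ' '
}
```

Usage would be something like `samtools view accepted_hits.bam | count_mapped_reads`, comparing the result with the 18238434 input reads reported in the TopHat log. `samtools flagstat accepted_hits.bam` gives a quicker summary, but it counts alignment records rather than distinct reads.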
Can you check the accepted_hits.bam file and see that you mapped a reasonable number of reads? Have you tried using cufflinks -g? Have you checked that the sequence names in your reference gtf file are EXACTLY the same as those in the reference you mapped against? Otherwise nothing will match the annotation (because you are using -G).
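The name check could be done roughly as follows. The Cufflinks log in the question lists loci such as 8:122386172-122400888, so the gtf appears to use names without a "chr" prefix. Whether the mm9 Bowtie index uses names like chr8 is an assumption here and is worth verifying. The helper names sam_header_names and gtf_names are invented for this sketch.

```shell
# Sequence names declared in a SAM header read on stdin (SN: field of @SQ lines).
# sam_header_names and gtf_names are hypothetical helper names.
sam_header_names() {
    awk -F'\t' '$1 == "@SQ" {
        for (i = 2; i <= NF; i++)
            if ($i ~ /^SN:/) print substr($i, 4)
    }' | sort -u
}

# Sequence names used in column 1 of a GTF file, skipping comment lines.
gtf_names() {
    grep -v '^#' "$1" | cut -f1 | sort -u
}
```

For example, run `samtools view -H accepted_hits.bam | sam_header_names > bam_names.txt` and `gtf_names Mus_musculus.NCBIM37.60.gtf > gtf_names.txt`, then `comm -12 bam_names.txt gtf_names.txt` to list the names present in both. Empty output would mean the annotation and the alignments share no sequence names, which would be consistent with all FPKM values being 0 under -G.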