I ran Cufflinks on the alignments produced by MapSplice and got three output files. My question is: why am I getting FPKM values of 0 for some entries but not for others, even though the last column, which shows the FPKM status, says OK? Also, why are some fields (such as tracking_id, class_code, etc.) missing in isoforms.fpkm_tracking and genes.fpkm_tracking? The command line used and excerpts of the output files are given below:
./cufflinks -o /data/memona/cufflinks-2.2.1.Linux_x86_64/result/ -p 64 -G /data/memona/annotation/arboreum.gff3 -b /data/memona/reference/chromosome.fa -u /data/memona/cufflinks-2.2.1.Linux_x86_64/alignmentMap_sorted.sam
isoforms.fpkm_tracking
Cotton_A_36275_BGI-A2_v1.0 - - - - chr9:95929518-95929903 243 0 0 0 0 OK
Cotton_A_40148_BGI-A2_v1.0 - - - - chr9:96215509-96219630 783 0 0 0 0 OK
Cotton_A_36277_BGI-A2_v1.0 - - - - chr9:95673000-95685403 627 12.724 4.98702 3.29366 6.68038 OK
Cotton_A_11823_BGI-A2_v1.0 - - - - chr9:94356880-94359955 3075 0.471196 0.173075 0 0.346234 OK
Cotton_A_36278_BGI-A2_v1.0 - - - - chr9:95593880-95595053 1173 0.203524 0.0745688 0 0.223706 OK
genes.fpkm_tracking
tracking_id class_code nearest_ref_id gene_id gene_short_name tss_id locus length coverage FPKM FPKM_conf_lo FPKM_conf_hi FPKM_status
- - - - chr1:84155-84764 - - 0 0 0 OK
- - - - chr1:94826-95120 - - 0 0 0 OK
- - - - chr1:303007-304950 - - 0 0 0 OK
- - - - chr1:334913-336019 - - 0 0 0 OK
- - - - chr1:569413-577498 - - 0 0 0 OK
- - - - chr1:545579-546643 - - 0 0 0 OK
- - - - chr1:328908-331176 - - 34.7035 30.4456 38.9615 OK
transcripts.gtf
chr1 Cufflinks transcript 84156 84764 1 - . gene_id ""; transcript_id "Cotton_A_10375_BGI-A2_v1.0"; FPKM "0.0000000000"; frac "0.000000"; conf_lo "0.000000"; conf_hi "0.000000"; cov "0.000000";
chr1 Cufflinks exon 84156 84764 1 - . gene_id ""; transcript_id "Cotton_A_10375_BGI-A2_v1.0"; exon_number "1"; FPKM "0.0000000000"; frac "0.000000"; conf_lo "0.000000"; conf_hi "0.000000"; cov "0.000000";
chr1 Cufflinks transcript 94827 95120 1 + . gene_id ""; transcript_id "Cotton_A_10374_BGI-A2_v1.0"; FPKM "0.0000000000"; frac "0.000000"; conf_lo "0.000000"; conf_hi "0.000000"; cov "0.000000";
chr1 Cufflinks exon 94827 95120 1 + . gene_id ""; transcript_id "Cotton_A_10374_BGI-A2_v1.0"; exon_number "1"; FPKM "0.0000000000"; frac "0.000000"; conf_lo "0.000000"; conf_hi "0.000000"; cov "0.000000";
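To gauge how widespread the empty gene_id is, a small script along these lines could scan transcripts.gtf. This is only a sketch: it assumes the file on disk is tab-separated with the attributes in the ninth column (standard for GTF), and the function name count_empty_gene_ids is made up for illustration.

```python
import re

# Matches GTF attribute pairs such as: gene_id ""; transcript_id "X";
ATTR_RE = re.compile(r'(\S+) "([^"]*)";')


def count_empty_gene_ids(lines):
    """Return (transcript records seen, records whose gene_id is empty).

    Hypothetical helper: assumes tab-separated GTF lines with the
    attribute string in column 9.
    """
    total = 0
    empty = 0
    for line in lines:
        cols = line.rstrip("\n").split("\t")
        # Only look at transcript-level records; skip exons and short lines.
        if len(cols) < 9 or cols[2] != "transcript":
            continue
        attrs = dict(ATTR_RE.findall(cols[8]))
        total += 1
        if not attrs.get("gene_id"):
            empty += 1
    return total, empty


# Possible usage on the real file:
# with open("transcripts.gtf") as handle:
#     print(count_empty_gene_ids(handle))
```

If every transcript record comes back with an empty gene_id, the problem most likely lies in the annotation that was passed to Cufflinks rather than in individual loci.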
Hi Michael, thanks for the prompt response. Could you please suggest which tools are currently recommended for aligning RNA-seq data to determine splice junctions, and what the alternatives to Cufflinks are besides StringTie?
Furthermore, I have checked the Cufflinks manual, and it describes the option as
-G/--GTF <reference_annotation.(gtf/gff)>
*To me, this means that it accepts both the GTF and GFF file formats.*
A few lines of the annotation file are as follows:
Hi,
There is a difference between the GFF that Cufflinks mentions (which should be GFF2) and a GFF3 file. Since you named your file .gff3, I was suggesting that you use a GTF instead. In your GFF file, there is no aggregation at the gene level, so Cufflinks cannot make a connection between an isoform and a gene. You may try the "-g" (--GTF-guide) option rather than "-G" (--GTF). This will create a guided assembly of your known isoforms into gene loci.
Cheers,
Michael
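One way to check the point about missing gene-level aggregation is to count how many transcript records in the GFF3 have a Parent attribute that points at a gene record. The sketch below assumes a standard tab-separated GFF3 with key=value attributes in column 9, and that transcripts are typed "mRNA" or "transcript"; the function names are made up for illustration, and the feature types in your file may differ.

```python
def parse_attributes(field):
    """Parse a GFF3 attribute column of the form key=value;key=value."""
    attrs = {}
    for part in field.strip().split(";"):
        if "=" in part:
            key, value = part.split("=", 1)
            attrs[key.strip()] = value.strip()
    return attrs


def summarize_gene_links(lines, transcript_types=("mRNA", "transcript")):
    """Count genes, transcripts, and transcripts whose Parent is a known gene.

    Hypothetical helper; transcript_types is an assumption about how the
    annotation labels its transcript-level features.
    """
    gene_ids = set()
    parents = []  # one entry per transcript: its Parent value or None
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9:
            continue
        attrs = parse_attributes(cols[8])
        if cols[2] == "gene":
            if "ID" in attrs:
                gene_ids.add(attrs["ID"])
        elif cols[2] in transcript_types:
            parents.append(attrs.get("Parent"))
    # A Parent value may list several IDs separated by commas.
    linked = sum(
        1 for parent in parents
        if parent and any(p in gene_ids for p in parent.split(","))
    )
    return {"genes": len(gene_ids), "transcripts": len(parents), "linked": linked}


# Possible usage on the annotation file:
# with open("arboreum.gff3") as handle:
#     print(summarize_gene_links(handle))
```

If "linked" is 0, or far below "transcripts", the annotation has no usable gene-to-isoform relationship, which would be consistent with the empty gene_id values in the output above.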