I'm running cuffmerge to merge 3 GTF files with a simple command ($ cuffmerge assemblies.txt), but I get the following error:
[Sat Jul 9 01:14:33 2016] Beginning transcriptome assembly merge
[Sat Jul 9 01:14:33 2016] Preparing output location ./merged_asm/
Warning: no reference GTF provided!
[Sat Jul 9 01:14:33 2016] Converting GTF files to SAM
[01:14:33] Loading reference annotation.
Error: duplicate GFF ID 'CUFF.27.1' encountered!
[FAILED]
Error: could not execute gtf_to_sam
I opened the 3 GTF files in Excel and sorted them by gene ID (column I). This showed me that just one of the files contains 2 transcripts with the same transcript ID, CUFF.27.1:
chr1 Cufflinks transcript 48697775 48698538 1000 . . gene_id "CUFF.27"; transcript_id "CUFF.27.1"; FPKM "1.6480365182"; frac "1.028571"; conf_lo "0.000000"; conf_hi "4.215554"; cov "1.309841"; full_read_support "yes";
chr1 Cufflinks transcript 2296353 2304639 1000 - . gene_id "CUFF.27"; transcript_id "CUFF.27.1"; FPKM "6.2346653714"; frac "1.000000"; conf_lo "3.015098"; conf_hi "9.454233"; cov "4.988501"; full_read_support "no";
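For anyone who wants to find such duplicates without Excel, here is a minimal Python sketch. It assumes a standard tab-separated GTF in which each transcript has exactly one "transcript" feature line, as in the two records above. The function name find_duplicate_transcripts is made up for illustration and is not part of Cufflinks.

```python
import re
from collections import defaultdict

# Matches the transcript_id attribute in the 9th GTF column.
TRANSCRIPT_ID = re.compile(r'transcript_id "([^"]+)"')


def find_duplicate_transcripts(lines):
    """Return {transcript_id: [(chrom, start, end, strand), ...]} for every
    transcript_id that appears on more than one 'transcript' feature line.

    Hypothetical helper; assumes tab-separated GTF input.
    """
    seen = defaultdict(list)
    for line in lines:
        # Skip comments and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        # Only 'transcript' rows are counted: exon rows legitimately
        # repeat the transcript_id of their parent transcript.
        if len(fields) < 9 or fields[2] != "transcript":
            continue
        match = TRANSCRIPT_ID.search(fields[8])
        if match:
            location = (fields[0], int(fields[3]), int(fields[4]), fields[6])
            seen[match.group(1)].append(location)
    return {tid: locs for tid, locs in seen.items() if len(locs) > 1}
```

Passing an open file handle for each GTF listed in assemblies.txt to this function should report CUFF.27.1 with its two locations for the affected file, and an empty dict for the others.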
I've read the previous posts, but none has a definitive answer; many say it is a bug in the Cufflinks output. Has anyone got a fix for this? I would rather not edit the file.
Thanks in advance, Kenneth.
You should provide both the cufflinks and cuffmerge commands that you used.
I think I have already solved this, Igor, but here are the commands used:
https://s31.postimg.org/s4phhy1jf/Screenshot_1.png
Great, but you should provide a reference annotation (cuffmerge's -g option) if one is available.
Thanks Igor - I have just posted another question if you would be so kind as to have a stab at it: From cuffmerge to cuffdiff: What input for cuffdiff?