Loading reference annotation.
GFF Error: duplicate/invalid 'transcript' feature ID=id350455
[FAILED]
Error: could not execute gtf_to_sam
It seems that more than one ID is duplicated, so I tried sort -u, but that doesn't work because the transcript ID sits in field 9, separated by semicolons from the gene ID, FPKM, and other attributes. Most of the transcript IDs are duplicated more than once.
Have you encountered this problem before? I am stuck here and need your help.
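Since sort -u cannot see inside field 9, one way to list the offending IDs is to pull transcript_id out of the attribute column and count it. The following is only a minimal sketch: it assumes a Cufflinks-style GTF with tab-separated columns, 'transcript' rows in column 3, and a transcript_id "..." attribute; the function name duplicate_transcript_ids is made up for illustration.

```python
import re
from collections import Counter

# Assumption: attributes look like  gene_id "X"; transcript_id "Y"; FPKM "1.0";
TRANSCRIPT_ID = re.compile(r'transcript_id "([^"]+)"')


def duplicate_transcript_ids(gtf_lines):
    """Return {transcript_id: count} for IDs found on more than one 'transcript' row."""
    counts = Counter()
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "transcript":
            continue  # only count transcript-level rows, not exons
        match = TRANSCRIPT_ID.search(fields[8])
        if match:
            counts[match.group(1)] += 1
    return {tid: n for tid, n in counts.items() if n > 1}
```

To use it, open the GTF file and pass the file handle to duplicate_transcript_ids; the returned dictionary maps each duplicated transcript ID to the number of 'transcript' rows that carry it.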
For the record, I had the same error ("GFF Error: duplicate/invalid 'transcript' feature ID=") and searched all over the internet without finding an answer.
In my case, it turned out that nothing was wrong with my reference GTF (iGenomes UCSC hg19).
I am fairly sure the problem was that I was running multiple cuffmerge runs at the same time in the same working directory. My guess is that they were all writing to the same temporary files, which caused the error. When I ran them in different working directories, the problem disappeared.
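A rough sketch of that workaround: give each cuffmerge run its own fresh working directory and its own output directory. The helper names and directory layout here are hypothetical; -g (reference GTF) and -o (output directory) are standard cuffmerge options, and absolute paths are used because the working directory changes for each run.

```python
import subprocess
from pathlib import Path


def cuffmerge_command(assembly_list, reference_gtf, out_dir):
    """Build a cuffmerge command line using absolute paths only."""
    return [
        "cuffmerge",
        "-g", str(Path(reference_gtf).resolve()),
        "-o", str(Path(out_dir).resolve()),
        str(Path(assembly_list).resolve()),
    ]


def run_isolated(assembly_list, reference_gtf, run_dir):
    """Run cuffmerge inside its own directory so parallel runs cannot collide."""
    run_dir = Path(run_dir).resolve()
    # exist_ok=False: refuse to reuse a directory another run may be writing to.
    run_dir.mkdir(parents=True, exist_ok=False)
    cmd = cuffmerge_command(assembly_list, reference_gtf, run_dir / "merged_asm")
    subprocess.run(cmd, cwd=run_dir, check=True)
```

Calling run_isolated once per sample set, each with a different run_dir, keeps the temporary files of concurrent runs apart.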
Written 9.2 years ago by Jingyue; updated 5.0 years ago by Ram
I wanted to use a pipeline on Galaxy main that had previously worked on other samples, but at the Cuffmerge/Cuffcompare step I hit the same "GFF Error: duplicate/invalid 'transcript'" that you describe.
I got around the problem by updating all the tools in the pipeline. Specifically, it appears that the problem occurs when the latest Cuffmerge version is used on data generated with older versions of Bowtie.
Hi, Kanne,
Thanks a lot for your reply. Yes, exactly: I tried all kinds of approaches, but it seems that every round of Cufflinks (I was using -G with the newest bovine genome, UMD3.1.1) generates duplicated transcript IDs. The duplicates share the same transcript ID, but their exons are all marked "exon1" when they should run from "exon1" to "exon10". So I went back to genome version UMD3.1, and Cufflinks runs well. That is just the weirdest thing ever.
Best,
Ellie
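The symptom described above (every exon of a transcript labelled as exon 1) can be checked for directly. This is a sketch under assumptions: it supposes the label is stored as an exon_number "..." attribute on 'exon' rows of a tab-separated GTF, which may not match every file; the function name is hypothetical.

```python
import re
from collections import defaultdict

# Assumption: attributes are written as  key "value";  pairs in column 9.
ATTRIBUTE = re.compile(r'(\w+) "([^"]*)"')


def transcripts_with_repeated_exon_numbers(gtf_lines):
    """Return sorted transcript IDs whose exon rows reuse an exon_number."""
    exon_numbers = defaultdict(list)
    for line in gtf_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "exon":
            continue  # comments, blank lines, and non-exon rows are skipped
        attrs = dict(ATTRIBUTE.findall(fields[8]))
        if "transcript_id" in attrs and "exon_number" in attrs:
            exon_numbers[attrs["transcript_id"]].append(attrs["exon_number"])
    # A transcript is flagged when at least one exon_number appears twice.
    return sorted(
        tid for tid, nums in exon_numbers.items() if len(nums) != len(set(nums))
    )
```

Running this on the Cufflinks output from each genome version would show whether the repeated exon numbering appears only with one of them.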