8.1 years ago
jolin0701-dy
I just got an error from cuffmerge:
$ ~/programs/cufflinks-2.1.1.OSX_x86_64/cuffmerge -g ~/GRCm38_86/mouse.gtf -s ~/GRCm38_86/mouse.fa assemblies.txt
[Mon Oct 10 19:28:48 2016] Beginning transcriptome assembly merge
-------------------------------------------
[Mon Oct 10 19:28:48 2016] Preparing output location ./merged_asm/
[Mon Oct 10 19:28:58 2016] Converting GTF files to SAM
[19:28:58] Loading reference annotation.
[19:28:59] Loading reference annotation.
[Mon Oct 10 19:29:02 2016] Quantitating transcripts
Warning: Your version of Cufflinks is not up-to-date. It is recommended that you upgrade to Cufflinks v2.2.1 to benefit from the most recent features and bug fixes (http://cufflinks.cbcb.umd.edu).
Command line:
cufflinks -o ./merged_asm/ -F 0.05 -g ~/GRCm38_86/mouse.gtf -q --overhang-tolerance 200 --library-type=transfrags -A 0.0 --min-frags-per-transfrag 0 --no-5-extend -p 1 ./merged_asm/tmp/mergeSam_tmp.2.PBd67n
[bam_header_read] EOF marker is absent. The input is probably truncated.
[bam_header_read] invalid BAM binary header (this is not a BAM file).
File ./merged_asm/tmp/mergeSam_tmp.2.PBd67n doesn't appear to be a valid BAM file, trying SAM...
[19:29:02] Loading reference annotation.
[19:29:18] Inspecting reads and determining fragment length distribution.
Processed 39590 loci.
> Map Properties:
> Normalized Map Mass: 105710.00
> Raw Map Mass: 105710.00
> Fragment Length Distribution: Truncated Gaussian (default)
> Default Mean: 200
> Default Std Dev: 80
[19:29:20] Assembling transcripts and estimating abundances.
8:119910359-124345724 Warning: Skipping large bundle.
Processed 39589 loci.
[Mon Oct 10 19:44:26 2016] Comparing against reference file ~/GRCm38_86/mouse.gtf
Warning: Your version of Cufflinks is not up-to-date. It is recommended that you upgrade to Cufflinks v2.2.1 to benefit from the most recent features and bug fixes (http://cufflinks.cbcb.umd.edu).
Error: duplicate GFF ID 'ENSMUST00000105372' encountered!
[FAILED]
Error: could not execute cuffcompare
Any suggestions? Thanks so much~~
I would suggest using the latest version of Cufflinks, although that might not resolve your issue. Many researchers have run into the same problem. It will work if you delete the duplicate entries from mouse.gtf. A similar issue was resolved by dhir_kumar on the SEQanswers forum: http://seqanswers.com/forums/showthread.php?t=22692
The GFF ID in your error (ENSMUST00000105372) is also related to "Selenocysteine". I looked at the Mus_musculus.GRCm38.84.gtf annotation file and found 62 entries related to Selenocysteine, so the solution above will probably work for you too.
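As a rough sketch of that kind of cleanup, assuming the duplicate IDs come from rows whose feature column (column 3) is "Selenocysteine", something like the following Python could write a filtered copy of the annotation. The function names and the output file name are my own and purely illustrative; this is not necessarily the command used in the linked thread.

```python
def is_feature(line, feature="Selenocysteine"):
    """Return True if a GTF line's feature column (column 3) equals `feature`."""
    if line.startswith("#"):
        # Header/comment lines are never treated as feature rows.
        return False
    fields = line.rstrip("\n").split("\t")
    return len(fields) >= 3 and fields[2] == feature


def filter_gtf(in_path, out_path, feature="Selenocysteine"):
    """Copy in_path to out_path, skipping rows of the given feature type.

    Returns the number of rows that were dropped.
    Assumption: the duplicate-ID error is caused by these rows.
    """
    dropped = 0
    with open(in_path) as src, open(out_path, "w") as dst:
        for line in src:
            if is_feature(line, feature):
                dropped += 1
            else:
                dst.write(line)
    return dropped
```

For example, calling filter_gtf("mouse.gtf", "mouse.noSec.gtf") (the second name is hypothetical) returns how many rows were removed, and you would then pass the filtered file to cuffmerge with -g instead of the original. Check the returned count against what you expect before rerunning.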