cufflinks and cuffmerge error
10.4 years ago
BDK_compbio ▴ 140

After running Cufflinks on two RNA-Seq samples, I ran cuffmerge with the following command:

cuffmerge -p 8 -g <directory>/<gff file> -s <directory>/<reference fasta file> <directory>/assemblies.txt

where assemblies.txt lists the transcripts.gtf files.

But I am getting the following error:

[Sun Jul 13 14:44:46 2014] Beginning transcriptome assembly merge
-------------------------------------------

[Sun Jul 13 14:44:46 2014] Preparing output location ./merged_asm/
[Sun Jul 13 14:46:04 2014] Converting GTF files to SAM
[14:46:04] Loading reference annotation.
[14:47:10] Loading reference annotation.
[14:48:06] Loading reference annotation.
[14:49:23] Loading reference annotation.
[14:50:29] Loading reference annotation.
[14:51:44] Loading reference annotation.
[Sun Jul 13 14:52:46 2014] Quantitating transcripts
Warning: Your version of Cufflinks is not up-to-date. It is recommended that you upgrade to Cufflinks v2.2.1 to benefit from the most recent features and bug fixes (http://cufflinks.cbcb.umd.edu).
Command line:
cufflinks -o ./merged_asm/ -F 0.05 -g <directory>/<gff file> -q --overhang-tolerance 200 --library-type=transfrags -A 0.0 --min-frags-per-transfrag 0 --no-5-extend -p 8 ./merged_asm/tmp/mergeSam_file3DKFYb 
[bam_header_read] EOF marker is absent. The input is probably truncated.
[bam_header_read] invalid BAM binary header (this is not a BAM file).
File ./merged_asm/tmp/mergeSam_file3DKFYb doesn't appear to be a valid BAM file, trying SAM...
[14:52:49] Loading reference annotation.
Error parsing strand (?) from GFF line:
IWGSC_CSS_1DS_scaff_731014      .       repeat_region   1       174     .       ?    .Name=trf;class=trf;repeat_consensus=ATTGGTATAGAACGCATGAAGAAACTCCATACAGATGGATCTTTAGACTCACTCAATTATGAAAAAATTGAGACATGCAAACCATGTCT;type=Tandem repeats
        [FAILED]
Error: could not execute cufflinks
cufflink cuffmerge • 8.9k views
10.4 years ago
komal.rathi ★ 4.1k

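You are using the wrong input file. You downloaded the GFF3 file instead of the GTF file. The repeat_region entries in your file carry a strand of "?", which is what Cufflinks fails to parse in the error above. The correct file is: ftp://ftp.ensemblgenomes.org/pub/plants/release-22/gtf/triticum_aestivum/Triticum_aestivum.IWGSP1.22.gtf.gz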

This is your GFF3 file:

##gff-version 3
IWGSC_CSS_5BS_scaff_1034127    .    repeat_region    61    200    .    +    .    Name=gnl|TREP|TREP3026;class=Unknown;repeat_consensus=N;type=Unknown
IWGSC_CSS_3AL_scaff_747250    .    repeat_region    2    200    .    +    .    Name=gnl|TREP|TREP765;class=Unknown;repeat_consensus=N;type=Unknown
IWGSC_CSS_3B_scaff_7107049    .    repeat_region    1    200    .    +    .    Name=gnl|TREP|TREP232;class=Unknown;repeat_consensus=N;type=Unknown

This is the GTF file:

IWGSC_CSS_6DL_scaff_127793    protein_coding    exon    526    645    .    +    .     gene_id "Traes_6DL_7FFFE462C"; transcript_id "Traes_6DL_7FFFE462C.2"; exon_number "1"; seqedit "false";
IWGSC_CSS_6DL_scaff_127793    protein_coding    CDS    574    645    .    +    0     gene_id "Traes_6DL_7FFFE462C"; transcript_id "Traes_6DL_7FFFE462C.2"; exon_number "1"; protein_id "Traes_6DL_7FFFE462C.2";
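
A quick way to catch this kind of mix-up before running cuffmerge is to scan the annotation first. The following is only a minimal Python sketch, not anything shipped with Cufflinks: the function names are made up for illustration, and it assumes that a GTF ninth column looks like `key "value";` while a GFF3 ninth column looks like `key=value`. Treating only "+", "-" and "." as acceptable strands is also an assumption, based on the "?" that triggered the error above.

```python
import re

# Hypothetical helper names; nothing here is part of Cufflinks itself.
# Assumption: strands other than these are rejected, as "?" was in the error above.
ACCEPTED_STRANDS = {"+", "-", "."}
GTF_ATTR = re.compile(r'^\s*\S+\s+"[^"]*"\s*;')   # e.g. gene_id "X";
GFF3_ATTR = re.compile(r'^\s*[^\s=;]+=')          # e.g. Name=trf;


def guess_format(lines):
    """Return 'gff3', 'gtf', or 'unknown' from the first informative line."""
    for line in lines:
        line = line.rstrip("\r\n")
        if line.startswith("##gff-version 3"):
            return "gff3"
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 9:
            continue
        if GTF_ATTR.match(fields[8]):
            return "gtf"
        if GFF3_ATTR.match(fields[8]):
            return "gff3"
    return "unknown"


def bad_strand_lines(lines):
    """Return (line_number, strand) for records whose strand is not accepted."""
    problems = []
    for number, line in enumerate(lines, start=1):
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\r\n").split("\t")
        if len(fields) >= 7 and fields[6] not in ACCEPTED_STRANDS:
            problems.append((number, fields[6]))
    return problems
```

Both functions take any iterable of lines, so an open file handle can be passed directly (open the file once per call, since each call reads through the handle).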

Thanks a lot. Yes, I have already started running the script using the GTF file.


sbdk82, whenever you figure out the solution to your question before others do, please post it here as an answer, or accept an answer that someone else has posted, so that people can focus on other 'open' questions. Thanks!


Yes, I was about to do that, but the script was still running and I was not sure whether using the GTF file had solved the issue.


Hi

I am new to RNA-Seq. I am trying to run cuffmerge, but it is giving me an error. Could you please help me?

cuffmerge -g hg19.gtf -s hg19.fa -p 8 assemblies.txt

[Sun Jan 25 22:51:41 2015] Beginning transcriptome assembly merge
-------------------------------------------

[Sun Jan 25 22:51:41 2015] Preparing output location ./merged_asm/
[Sun Jan 25 22:51:48 2015] Converting GTF files to SAM
[22:51:48] Loading reference annotation.
[22:51:49] Loading reference annotation.
[22:51:51] Loading reference annotation.
[22:51:53] Loading reference annotation.
[Sun Jan 25 22:51:55 2015] Quantitating transcripts
Warning: Your version of Cufflinks is not up-to-date. It is recommended that you upgrade to Cufflinks v2.2.1 to benefit from the most recent features and bug fixes (http://cufflinks.cbcb.umd.edu).
Command line:
cufflinks -o ./merged_asm/ -F 0.05 -g hg19.gtf -q --overhang-tolerance 200 --library-type=transfrags -A 0.0 --min-frags-per-transfrag 0 --no-5-extend -p 8 ./merged_asm/tmp/mergeSam_fileZZUtV7
[bam_header_read] EOF marker is absent. The input is probably truncated.
[bam_header_read] invalid BAM binary header (this is not a BAM file).
File ./merged_asm/tmp/mergeSam_fileZZUtV7 doesn't appear to be a valid BAM file, trying SAM...
[22:51:55] Loading reference annotation.
    [FAILED]
Error: could not execute cufflinks

It seems there is some error in the GTF file. Please check if it is the correct one.


Hi sbdk82 and komal.rathi;

I checked the GTF file. I also switched to a different GTF file and tried again, but it still gives the same error.


Does assemblies.txt list all the transcripts.gtf files? Did you run TopHat and Cufflinks before running cuffmerge? Try sorting all the SAM/BAM files before running Cufflinks.
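
The first of these questions can be checked mechanically. Below is a minimal Python sketch, assuming the assembly list is a plain text file with one GTF path per line; the function name is hypothetical, and relative paths are resolved against the current working directory, so it should be run from the same directory as cuffmerge.

```python
import os


def check_assembly_list(list_path):
    """Return (path, problem) pairs for entries in an assembly list file."""
    problems = []
    with open(list_path) as handle:
        for raw in handle:
            path = raw.strip()
            if not path:
                continue  # ignore blank lines in the list
            if not os.path.isfile(path):
                problems.append((path, "missing"))
            elif os.path.getsize(path) == 0:
                problems.append((path, "empty"))
    return problems
```

An empty result only means that every listed file exists and is non-empty; it says nothing about whether the GTF contents are valid.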


Yes, I have the output from TopHat and Cufflinks, and assemblies.txt lists all the transcripts.gtf files. When I run cuffmerge, the following error occurs:

[bam_header_read] EOF marker is absent. The input is probably truncated.
[bam_header_read] invalid BAM binary header (this is not a BAM file).
File ./merged_asm/tmp/mergeSam_fileMgq1Io doesn't appear to be a valid BAM file, trying SAM...
[22:40:59] Loading reference annotation.
    [FAILED]
Error: could not execute cufflinks

nikhilvgbt, sorry for the delayed response. Did you get things worked out?


Yes, it worked for me. The BAM file error occurred because the chromosomes were not ordered consistently between the reference genome and the GTF file. Thank you, komal.rathi.

And komal.rathi, how did you solve this error? That way other people will have an answer for it.
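
For anyone who wants to check for the chromosome-order mismatch described in this reply, here is a minimal Python sketch. It only detects a discrepancy and does not fix it; the reply does not say how the order was corrected. The function names are hypothetical, and the sketch assumes sequence names are the first word of each FASTA header and the first tab-separated column of each GTF record.

```python
def fasta_names(lines):
    """Sequence names in file order, taken from FASTA header lines."""
    return [line[1:].split()[0]
            for line in lines
            if line.startswith(">") and line[1:].strip()]


def gtf_names(lines):
    """Sequence names from column 1, in order of first appearance."""
    ordered = []
    known = set()
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        name = line.split("\t", 1)[0]
        if name not in known:
            known.add(name)
            ordered.append(name)
    return ordered


def compare_order(fasta, gtf):
    """Return (names_missing_from_fasta, shared_names_in_same_order)."""
    in_fasta = set(fasta)
    in_gtf = set(gtf)
    missing = [name for name in gtf if name not in in_fasta]
    fasta_shared = [name for name in fasta if name in in_gtf]
    gtf_shared = [name for name in gtf if name in in_fasta]
    return missing, fasta_shared == gtf_shared
```

A non-empty first element means the GTF refers to sequences that the FASTA file does not contain, which is worth fixing before looking at the order at all.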

10.4 years ago
Josh Herr 5.8k

Cufflinks is not reading your GFF file.

You didn't give us much information to go on, but it looks like it might be a formatting error. Make sure the file has no misaligned columns or extra line endings.
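
A rough way to look for those two problems is sketched below in Python. It is an illustration under stated assumptions, not a validator: the function name is hypothetical, and it assumes every record should have exactly nine tab-separated columns and that lines starting with "#" are comments.

```python
def format_problems(lines, expected_columns=9):
    """Return (line_number, message) pairs for suspicious lines."""
    problems = []
    for number, line in enumerate(lines, start=1):
        # Windows-style or stray carriage returns count as extra line endings.
        if line.endswith("\r\n") or line.endswith("\r"):
            problems.append((number, "carriage return at end of line"))
        text = line.rstrip("\r\n")
        if not text:
            problems.append((number, "blank line"))
            continue
        if text.startswith("#"):
            continue
        count = len(text.split("\t"))
        if count != expected_columns:
            problems.append(
                (number, "%d columns instead of %d" % (count, expected_columns)))
    return problems
```

When reading from a file, open it with `open(path, newline="")`; otherwise Python translates "\r\n" to "\n" on read and the carriage-return check never fires.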


Hi Josh,

I used the following gff file and reference file.
