Warning with cuffmerge: couldn't find fasta record
7.3 years ago

Hi, I'm trying to run cuffmerge with my annotation file using the following command:

cuffmerge -p 8 -g TAIR_gff/TAIR10_GFF3_genes.gff -s TAIR10_chr_all_new.fa assemblies.txt

But I get the following output message:

[Tue Aug  1 12:14:09 2017] Quantitating transcripts
Warning: Could not connect to update server to verify current version. Please check at the Cufflinks website (http://cufflinks.cbcb.umd.edu).
Command line:
cufflinks -o ./merged_asm/ -F 0.05 -g TAIR_gff/TAIR10_GFF3_genes.gff -q --overhang-tolerance 200 --library-type=transfrags -A 0.0 --min-frags-per-transfrag 0 --no-5-extend -p 8 ./merged_asm/tmp/mergeSam_filesuXYVC 
[bam_header_read] EOF marker is absent. The input is probably truncated.
[bam_header_read] invalid BAM binary header (this is not a BAM file).
File ./merged_asm/tmp/mergeSam_filesuXYVC doesn't appear to be a valid BAM file, trying SAM...
[12:14:10] Loading reference annotation.
[12:14:12] Inspecting reads and determining fragment length distribution.
Processed 48523 loci.                       
Map Properties:
Normalized Map Mass: 3977199.00
Raw Map Mass: 3977199.00
>   Fragment Length Distribution: Truncated Gaussian (default)
>                 Default Mean: 200
>              Default Std Dev: 80
[12:14:30] Assembling transcripts and estimating abundances.
Processed 48523 loci.                       
[Tue Aug  1 12:15:34 2017] Comparing against reference file TAIR_gff/TAIR10_GFF3_genes.gff
Warning: Could not connect to update server to verify current version. Please check at the Cufflinks     website (http://cufflinks.cbcb.umd.edu).
Warning: couldn't find fasta record for '1'!
Warning: couldn't find fasta record for '2'!
Warning: couldn't find fasta record for '3'!
Warning: couldn't find fasta record for '4'!
Warning: couldn't find fasta record for '5'!
Warning: couldn't find fasta record for 'chloroplast'!
Warning: couldn't find fasta record for 'mitochondria'!
[Tue Aug  1 12:15:49 2017] Comparing against reference file TAIR_gff/TAIR10_GFF3_genes.gff
Warning: Could not connect to update server to verify current version. Please check at the Cufflinks website (http://cufflinks.cbcb.umd.edu).
Warning: couldn't find fasta record for '1'!
Warning: couldn't find fasta record for '2'!
Warning: couldn't find fasta record for '3'!
Warning: couldn't find fasta record for '4'!
Warning: couldn't find fasta record for '5'!
Warning: couldn't find fasta record for 'chloroplast'!
Warning: couldn't find fasta record for 'mitochondria'!

I checked the sequence names in my annotation file and my FASTA file:

awk -F "\t" '{print $1}' TAIR10_GFF3_genes.gff | uniq

Chr1
Chr2
Chr3
Chr4
Chr5
ChrC
ChrM

The sequence names in my FASTA file:

    grep '>' TAIR10_chr_all_new.fa

    >Chr1
    >Chr2
    >Chr3
    >Chr4
    >Chr5
    >ChrC
    >ChrM

It seems that both the annotation and the FASTA file use the same sequence names. Could you guide me toward a solution?

Assembly Cufflinks Cuffmerge • 4.5k views

Just a guess, but I think that the files that you have listed in your assemblies.txt file are causing the problem.

How did you produce the individual transcripts.gtf files that you are trying to merge? It looks like the reads were aligned to a different reference.

Also check that you indexed your own genome (TAIR10_chr_all_new.fa) and are not using an index that you downloaded from somewhere else.
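
One way to test this guess is to compare the sequence names in every GTF listed in assemblies.txt against the FASTA headers. Below is a minimal sketch, assuming assemblies.txt holds one GTF path per line; `missing_names` and `check_assemblies` are helper names made up for this example, not part of Cufflinks.

```shell
# Print sequence names (column 1) that occur in a GTF/GFF file
# but have no matching '>' header in the FASTA file.
# $1 = FASTA file, $2 = GTF/GFF file
missing_names() {
    awk '
        # First file: remember the first word of each FASTA header, minus ">".
        FILENAME == ARGV[1] { if (/^>/) known[substr($1, 2)] = 1; next }
        # Second file: skip comment and blank lines.
        /^#/ || NF < 2 { next }
        # Report each unknown sequence name once.
        !($1 in known) && !seen[$1]++ { print $1 }
    ' "$1" "$2"
}

# Run the check for every GTF path listed in a file such as assemblies.txt.
# $1 = FASTA file, $2 = list file with one GTF path per line
check_assemblies() {
    while IFS= read -r gtf; do
        [ -n "$gtf" ] || continue
        missing=$(missing_names "$1" "$gtf")
        if [ -n "$missing" ]; then
            printf '%s: not in FASTA: %s\n' "$gtf" \
                "$(printf '%s' "$missing" | tr '\n' ' ')"
        fi
    done < "$2"
}

# Example call with the file names from the question:
# check_assemblies TAIR10_chr_all_new.fa assemblies.txt
```

If this prints names such as 1 or chloroplast for your transcripts.gtf files, the assemblies were built against a reference with different sequence names than TAIR10_chr_all_new.fa.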


You have a mismatch in the chromosome names used for the cufflinks assembly. Please check your cufflinks command to see which reference files you used for the assembly. The assembled GTF files have chromosome names like '1' instead of 'Chr1'.
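
The cleanest fix is to redo the alignment and assembly against TAIR10_chr_all_new.fa. If that is not practical, a possible workaround is to rewrite the first column of each assembled GTF so the names match the FASTA. The sketch below assumes that 1 to 5 correspond to Chr1 to Chr5, chloroplast to ChrC, and mitochondria to ChrM; that mapping is inferred from the warnings and the FASTA headers in the question and is not verified against your files. `rename_chroms` is a hypothetical helper name. Renaming only makes sense if both references contain the same TAIR10 sequences under different names; otherwise the coordinates will not match.

```shell
# Rewrite column 1 of a GTF so the sequence names match the FASTA headers.
# ASSUMED mapping: 1..5 -> Chr1..Chr5, chloroplast -> ChrC, mitochondria -> ChrM
# $1 = input GTF; the renamed GTF is written to stdout
rename_chroms() {
    awk '
        BEGIN {
            FS = OFS = "\t"
            map["chloroplast"] = "ChrC"
            map["mitochondria"] = "ChrM"
            for (i = 1; i <= 5; i++) map[i] = "Chr" i
        }
        /^#/ { print; next }           # keep comment lines unchanged
        ($1 in map) { $1 = map[$1] }   # rename known sequence names only
        { print }
    ' "$1"
}

# Example call (the sample1 path is a placeholder):
# rename_chroms sample1/transcripts.gtf > sample1/transcripts.renamed.gtf
```

After renaming, list the new files in assemblies.txt and rerun cuffmerge.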
