Hi, I am using the genes.gtf and genome.fa files from iGenomes for Homo sapiens Ensembl GRCh37 with cuffcompare and cuffmerge, but I get the following warnings (there are many; I am pasting only a few to keep the message short):
Warning: couldn't find fasta record for 'GL000191.1'!
Warning: couldn't find fasta record for 'GL000192.1'!
Warning: couldn't find fasta record for 'GL000193.1'!
Warning: couldn't find fasta record for 'GL000194.1'!
Warning: couldn't find fasta record for 'GL000195.1'!
Warning: couldn't find fasta record for 'GL000196.1'!
I have seen previous posts with similar questions, but I am not sure why I get these warnings when the genes.gtf and genome.fa come from the same set of files provided by iGenomes. The command I ran is below:
cuffcompare -r /home/jmotwani/mydata/Genomes/Homo_sapiens/Ensembl/GRCh37/Annotation/Genes/genes.gtf -s /home/jmotwani/mydata/Genomes/Homo_sapiens/Ensembl/GRCh37/Sequence/WholeGenomeFasta/genome.fa -o testcuffcomp test1.gtf test2.gtf
The version of Cufflinks I am using is 2.2.1. An output file is generated, but I am not sure whether it is complete or truncated because of these warnings.
Any help with this will be greatly appreciated. Thanks.
I extracted the chromosome name column from the GTF file with: cut -f 1 genes.gtf | sort | uniq
The resulting list includes all the contig names from the warning messages, and no FASTA files are provided for those contigs in the genomes folder of iGenomes. I was wondering why the GTF file includes these contigs if their FASTA files are not provided. I presume I can go ahead with the cuffdiff analysis in spite of these warnings, because skipping these contigs would not affect the analysis. Any thoughts?
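The comparison described above can be automated. This is a minimal sketch, assuming the sequence name is the first whitespace-delimited word of each FASTA header line and the first tab-separated column of the GTF. The function name missing_contigs is hypothetical and is not part of Cufflinks or iGenomes.

```shell
# Hypothetical helper: list sequence names used in a GTF that have no
# matching header in a FASTA file.
# Usage: missing_contigs genome.fa genes.gtf
missing_contigs() {
    fasta="$1"
    gtf="$2"
    tmpdir=$(mktemp -d)
    # Names in the FASTA: header text after ">" up to the first whitespace.
    grep '^>' "$fasta" | sed 's/^>//' | awk '{print $1}' | LC_ALL=C sort -u > "$tmpdir/fa"
    # Names in the GTF: first column, skipping comment lines.
    grep -v '^#' "$gtf" | cut -f 1 | LC_ALL=C sort -u > "$tmpdir/gtf"
    # comm -13 prints only the lines unique to the second file (GTF-only names).
    LC_ALL=C comm -13 "$tmpdir/fa" "$tmpdir/gtf"
    rm -rf "$tmpdir"
}
```

If the names it prints match the ones in the warnings, the warnings are explained by the GTF referring to sequences that are absent from genome.fa.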
Yes. You could clean your GTF to keep only the chromosomes/contigs present in your FASTA file, so that these warnings disappear and the analysis is clean, instead of going ahead with warnings.
I have truncated my message so that it is not too long for people to read. Hopefully I may get some suggestions now. Thanks for your help.
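The GTF cleanup suggested in the answer could be sketched as follows. This is only an illustration: it assumes sequence names are the first whitespace-delimited word of each FASTA header and the first tab-separated column of the GTF. The function name filter_gtf_by_fasta and the output file name in the usage line are hypothetical.

```shell
# Hypothetical helper: keep only GTF records whose sequence name (column 1)
# has a matching header in the FASTA file.
# Usage: filter_gtf_by_fasta genome.fa genes.gtf > genes.filtered.gtf
filter_gtf_by_fasta() {
    fasta="$1"
    gtf="$2"
    awk -F '\t' '
        # First file (FASTA): record each sequence name, i.e. the header
        # text after ">" up to the first space or tab.
        NR == FNR {
            if (substr($0, 1, 1) == ">") {
                split(substr($0, 2), parts, /[ \t]/)
                keep[parts[1]] = 1
            }
            next
        }
        # Second file (GTF): pass comment lines through and keep records
        # whose first column is a recorded sequence name.
        /^#/ || ($1 in keep)
    ' "$fasta" "$gtf"
}
```

The filtered file can then be passed to cuffcompare with -r in place of the original genes.gtf. Records on the dropped contigs are removed from the reference annotation, so this is appropriate only if those contigs are not needed in the analysis.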