Hello,
I used STAR to build a genome index:
Pavilion-Desktop-590-p0xxx:~/STAR-2.7.0e/source$ ./STAR --runThreadN 30 --runMode genomeGenerate --genomeDir /home/li/genome/ --genomeFastaFiles /home/li/genome/UMD3.1_chromosomes.fa --sjdbGTFfile /home/li/genome/Bos_taurus.ARS-UCD1.2.98.chr.gtf --sjdbOverhang 99
Oct 12 22:50:52 ..... started STAR run
Oct 12 22:50:52 ... starting to generate Genome files
Oct 12 22:51:40 ... starting to sort Suffix Array. This may take a long time...
Oct 12 22:51:51 ... sorting Suffix Array chunks and saving them to disk...
Oct 12 23:22:15 ... loading chunks from disk, packing SA...
Oct 12 23:35:59 ... finished generating suffix array
Oct 12 23:35:59 ... generating Suffix Array index
Oct 12 23:49:34 ... completed Suffix Array index
Oct 12 23:49:34 ..... processing annotations GTF
Fatal INPUT FILE error, no valid exon lines in the GTF file: /home/li/genome/Bos_taurus.ARS-UCD1.2.98.chr.gtf
Solution: check the formatting of the GTF file. Most likely cause is the difference in chromosome naming between GTF and FASTA file.
Oct 12 23:49:48 ...... FATAL ERROR, exiting
Thanks in advance for your help!
Best,
Yue
Did you get your FASTA and GTF files from the same source?
' Most likely cause is the difference in chromosome naming between GTF and FASTA file.'
Just check the chr names in GTF and FASTA files first...
Thanks shoujin.gu,
the names in GTF and FASTA files:
Pavilion-Desktop-590-p0xxx:~/STAR-2.7.0e/source$ ./STAR --runThreadN 30 --runMode genomeGenerate --genomeDir /home/li/genome/ --genomeFastaFiles /home/li/genome/GCF_002263795.1_ARS-UCD1.2_refseq_chrids.fa --sjdbGTFfile /home/li/genome/Bos_taurus.ARS-UCD1.2.98.chr.gtf --sjdbOverhang 99
Oct 13 22:03:30 ..... started STAR run
Oct 13 22:03:31 ... starting to generate Genome files
Oct 13 22:04:19 ... starting to sort Suffix Array. This may take a long time...
Oct 13 22:04:29 ... sorting Suffix Array chunks and saving them to disk...
Oct 13 22:25:05 ... loading chunks from disk, packing SA...
Oct 13 22:38:34 ... finished generating suffix array
Oct 13 22:38:34 ... generating Suffix Array index
Oct 13 22:52:45 ... completed Suffix Array index
Oct 13 22:52:45 ..... processing annotations GTF
Oct 13 22:52:57 ..... inserting junctions into the genome indices
Oct 14 01:09:48 ... writing Genome to disk ...
Oct 14 01:13:02 ... writing Suffix Array to disk ...
Oct 14 02:44:31 ... writing SAindex to disk
Oct 14 02:44:51 ..... finished successfully
I mean take a look at the chromosome names inside each file and check whether they are consistent. For example, are they all like 'Chr 1' in both files, or all like '1'?
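One way to do this check from the shell is sketched below. It is only a minimal sketch: the function names `fasta_names` and `gtf_names` are made up for illustration, and it assumes both files are uncompressed, that FASTA headers start with `>`, and that the GTF is tab-delimited with `#` comment lines.

```shell
# List the distinct sequence names in a FASTA file.
# Header lines start with ">"; the name is the text up to the first whitespace.
fasta_names() {
    grep '^>' "$1" | sed 's/^>//' | awk '{print $1}' | sort -u
}

# List the distinct sequence names in a GTF file.
# The name is column 1 of every line that is not a "#" comment.
gtf_names() {
    grep -v '^#' "$1" | cut -f1 | sort -u
}

# Example usage with the paths from the first post (adjust as needed):
#   fasta_names /home/li/genome/UMD3.1_chromosomes.fa | head
#   gtf_names /home/li/genome/Bos_taurus.ARS-UCD1.2.98.chr.gtf | head
```

If the two lists use different styles (for example a prefix in one file and bare numbers in the other), that would be consistent with the "no valid exon lines" error above.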
Hello, shoujun.gu,
Thank you for your message!
I downloaded the data from:
https://bovinegenome.elsiklab.missouri.edu/node/68
GCF_002263795.1_ARS-UCD1.2_refseq_chrids.fa.gz
ftp://ftp.ensembl.org/pub/current_gtf/bos_taurus/
Bos_taurus.ARS-UCD1.2.98.chr.gtf.gz
Thank you again!
yue
It seems your data come from different sources, so their chromosome labels may be different. This can cause the program error. Try to download your FASTA and GTF files from the same website.
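Before spending an hour on the suffix array, it may be worth checking that the two files share at least some sequence names. The following is a rough sketch, not part of STAR: `shared_names` is a hypothetical helper, and it assumes uncompressed files and a tab-delimited GTF.

```shell
# Print the sequence names that occur in both the FASTA ($1) and the GTF ($2).
# An empty result means no GTF line refers to a sequence present in the FASTA.
shared_names() {
    fa_list=$(mktemp)
    gtf_list=$(mktemp)
    # FASTA names: header text after ">" up to the first whitespace.
    grep '^>' "$1" | sed 's/^>//' | awk '{print $1}' | LC_ALL=C sort -u > "$fa_list"
    # GTF names: column 1 of non-comment lines.
    grep -v '^#' "$2" | cut -f1 | LC_ALL=C sort -u > "$gtf_list"
    # comm -12 keeps only the lines common to both sorted lists.
    LC_ALL=C comm -12 "$fa_list" "$gtf_list"
    rm -f "$fa_list" "$gtf_list"
}

# Example usage (paths from the thread; adjust as needed):
#   shared_names /home/li/genome/UMD3.1_chromosomes.fa \
#                /home/li/genome/Bos_taurus.ARS-UCD1.2.98.chr.gtf | wc -l
```

A count of 0 would suggest that the FASTA and GTF use different naming schemes and should be taken from the same source.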