Hi everyone,
I am trying to generate genome indexes with STAR to align my RNAseq data,
with this command line:
/data/software/STAR/source/STAR --runThreadN 16 --runMode genomeGenerate --genomeDir star_genome3/ --genomeFastaFiles Pvulgaris_442_v2.0.fa --sjdbGTFfile phavu.G19833.gnm2.ann1.PB8d.gene_exons.gff3 --sjdbGTFfeatureExon exon --sjdbGTFtagExonParentTranscript Parent --genomeChrBinNbits 18 --sjdbOverhang 100
but it finishes after only 5 minutes, and I suspect something is wrong because that seems too fast.
The output files are listed here:
Genome SAindex chrName.txt chrStart.txt exonInfo.tab genomeParameters.txt sjdbList.fromGTF.out.tab transcriptInfo.tab SA chrLength.txt chrNameLength.txt exonGeTrInfo.tab geneInfo.tab sjdbInfo.txt sjdbList.out.tab
Then I changed the run mode to alignment with this command:
/data/software/STAR/source/STAR --runMode alignReads --genomeDir /data/mshoorooei/star_genome4/ --runThreadN 16 --outFilterMismatchNmax 2 --readFilesIn PE_27_F.fq.gz PE_27_R.fq.gz --readFilesCommand gunzip -c --outFileNamePrefix 27_ --outReadsUnmapped unmapped_27 --outSAMtype BAM SortedByCoordinate
Output is here:
27_Aligned.sortedByCoord.out.bam
27_Log.final.out
27_Log.out
27_Log.progress.out
27_SJ.out.tab
Unfortunately, this gives me the same problem: it also finishes very quickly.
Do you have any ideas? Thanks for your suggestions.
Thanks for your comment. My genome is nearly 600 Mb. How can I tell whether Log.progress.out and Log.final.out look right?
You should watch the output of STAR while it is running. During genome generation it should output:
This was for a 680 Mbase genome in 33,000 scaffolds, run with 120 CPUs, but I don't think multiple cores help much during genome generation.
During alignment it should output something like:
Using Log.final.out you can then compare the number of input reads with the number of sequences in your input file (they should be the same, of course) and check the mapping rate (90%+ is common for good data).
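The check described above can be sketched with two small shell helpers. This is only a rough sketch: the function names `count_fastq_reads` and `star_log_value` are made up for illustration, and it assumes a standard 4-line-per-read gzipped FASTQ and the usual `field name |<TAB>value` layout of Log.final.out.

```shell
# Count reads in a gzipped FASTQ file.
# Assumes 4 lines per read (no wrapped sequence lines).
count_fastq_reads() {
    echo $(( $(gunzip -c "$1" | wc -l) / 4 ))
}

# Print the value of one field from a STAR Log.final.out file.
# Usage: star_log_value <Log.final.out> "<field name>"
# Lines in that file look like "   Number of input reads |<TAB>12345".
star_log_value() {
    awk -F'|' -v key="$2" '
        { k = $1; gsub(/^[ \t]+|[ \t]+$/, "", k) }   # trim the field name
        k == key { v = $2; gsub(/^[ \t]+|[ \t]+$/, "", v); print v }
    ' "$1"
}
```

For the run in the question, `count_fastq_reads PE_27_F.fq.gz` and `star_log_value 27_Log.final.out "Number of input reads"` should print the same number, since STAR counts each read pair as one input read for paired-end data. `star_log_value 27_Log.final.out "Uniquely mapped reads %"` then gives the unique mapping rate.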
This is the output during genome generation. It looks the same.
And this is the STAR output during alignment.
So there is no obvious error. Your genome generation is faster than ours, but this is probably I/O related.
Okay, thanks so much for your help.