Hi all,
I am trying to analyze my RNA-seq data with RSEM, using STAR as the aligner.
For this I downloaded the GTF file and the hg19 genome, and I am running the following command:
rsem-prepare-reference --gtf hg19-iGenomes.gtf --star /data/RSEM/hg19 hg19_rsem
This is based on the RSEM usage "rsem-prepare-reference [options] reference_fasta_file(s) reference_name",
where "/data/RSEM/hg19" is the directory containing the .fa (FASTA) files for all the chromosomes and hg19_rsem is the reference name (it is passed to STAR as --outFileNamePrefix).
It generates the output files hg19_rsem.n2g.idx.fa, .idx.fa, .seq, .transcript.fa, .chrlist, .grp and .ti, as well as chrLength.txt, chrName.txt, chrNameLength.txt, chrStart.txt, genomeParameters.txt, hg19_rsemLog.out and hg19_rsem_STARtmp.
However, the run ends with:
Aug 22 13:34:46 ..... started STAR run
Aug 22 13:34:46 ... starting to generate Genome files
Aug 22 13:35:57 ... starting to sort Suffix Array. This may take a long time...
Aug 22 13:36:22 ... sorting Suffix Array chunks and saving them to disk...
"STAR --runThreadN 1 --runMode genomeGenerate --genomeDir . --genomeFastaFiles /data//RSEM/hg19/chr1.fa, ........ /data/RSEM/hg19/chrY.fa --sjdbGTFfile hg19-iGenomes.gtf --sjdbOverhang 100 --outFileNamePrefix hg19_rsem" failed! Plase check if you provide correct parameters/options for the pipeline!
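In case it helps with diagnosing this, here is a small sketch of what I can run to collect more detail. It prints the last lines of the STAR log and, on Linux, the available memory. I included the memory check because the failure happens during suffix array sorting, which the STAR manual describes as memory-intensive for a human genome. That memory is the cause here is only an assumption. The function name show_star_log is hypothetical, and the log name is taken from the file list above (prefix followed by Log.out).

```shell
#!/bin/sh
# Hypothetical helper: print the last lines of the STAR log for a given prefix.
# Assumes the log is named <prefix>Log.out, as in the file list above.
show_star_log() {
    log="${1}Log.out"
    if [ -f "$log" ]; then
        tail -n 20 "$log"
    else
        echo "log not found: $log"
    fi
}

# Prefix used in the rsem-prepare-reference call above.
show_star_log hg19_rsem

# Report total and free memory in GB, if the 'free' utility is present (Linux).
if command -v free >/dev/null 2>&1; then
    free -g
fi
```

I can post the output of this if it is useful.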
I am unable to understand where I am going wrong. Please help me sort this out.
Thanks, Debbie