I have successfully used STAR to align the paired-end reads for all of my RNA-seq samples with the option --quantMode TranscriptomeSAM. I got the annotation files from GENCODE (GFF3 and GTF), and everything went smoothly.
The problem I have now is with RSEM.
To start, I ran rsem-prepare-reference to generate the indices for RSEM, and that went smoothly as well.
Then I ran this command:
rsem-calculate-expression --bam --no-bam-output -p 10 --estimate-rspd --calc-ci --seed 12345 --paired-end <path to the Aligned.toTranscriptome.out.bam file> <path and name (prefix) used to create the indices for RSEM> <path and name (prefix) to be used for the output files>
I first get this warning:
Warning: The SAM/BAM file declares less reference sequences (199166) than RSEM knows (199324)! Please make sure that you aligned your reads against transcript sequences
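To see which transcripts account for the difference between the two counts in that warning, the sequence names in the BAM header could be compared with the names in the reference that RSEM built. Below is a minimal sketch, assuming the BAM header has been dumped to a text file (for example with samtools view -H) and that rsem-prepare-reference wrote a .transcripts.fa file next to the reference prefix. The function names and the file names in the usage comment are hypothetical and are not part of RSEM, STAR, or samtools.

```shell
# Hypothetical helper functions, using only standard POSIX tools.

# Print the reference names declared in a SAM header text file
# (the SN: field of each @SQ line), sorted and de-duplicated.
sam_header_names() {
    awk -F'\t' '$1 == "@SQ" {
        for (i = 2; i <= NF; i++) if ($i ~ /^SN:/) print substr($i, 4)
    }' "$1" | LC_ALL=C sort -u
}

# Print the sequence names of a FASTA file (first word after ">"),
# sorted and de-duplicated.
fasta_names() {
    awk '/^>/ { sub(/^>/, "", $1); print $1 }' "$1" | LC_ALL=C sort -u
}

# Print the names that are in the FASTA file but not in the SAM header.
# Arguments: <header text file> <FASTA file>
missing_from_bam() {
    tmpdir=$(mktemp -d)
    sam_header_names "$1" > "$tmpdir/bam_names.txt"
    fasta_names "$2" > "$tmpdir/ref_names.txt"
    # comm -13 keeps only the lines unique to the second file.
    LC_ALL=C comm -13 "$tmpdir/bam_names.txt" "$tmpdir/ref_names.txt"
    rm -rf "$tmpdir"
}

# Example usage (assumed file names, adjust to your own paths):
#   samtools view -H Aligned.toTranscriptome.out.bam > header.sam
#   missing_from_bam header.sam GRCh38.transcripts.fa | head
```

If this lists transcripts that RSEM knows but the BAM header does not declare, one possible explanation is that the STAR index and the RSEM reference were built from different annotation files, which would be worth ruling out first.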
Then, at the end of the log file, I get these messages as well:
Fragment 48304137 is hung over the end of transcript 121465! It is possible that the aligner you use gave different read lengths for a same read in SAM file.
The alignment of fragment 55086151 to transcript 135785 starts at -440 from the forward direction, which should be a non-negative number! It is possible that the aligner you use gave different read lengths for a same read in SAM file.
Found unknown sequence letter at function get_rbase_id!
"rsem-run-em /home/saryou/AML_RNAseq/IndexedGenomeRNA/rsem/GRCh38 3 /home/saryou/new/sra/rna/Patient_AML_003/SRR3088034 /home/saryou/new/sra/rna/Patient_AML_003/SRR3088034.temp/SRR3088034 /home/saryou/new/sra/rna/Patient_AML_003/SRR3088034.stat/SRR3088034 -p 10 --gibbs-out" failed! Plase check if you provide correct parameters/
As I said before, the alignment step went smoothly, so I don't know why RSEM is complaining about Aligned.toTranscriptome.out.bam. However, I assumed that these were just warnings and would not prevent RSEM from generating the .genes.results and .isoforms.results files.
When I navigated to the output directory, I found two folders created by RSEM, “output name used”.stat and “output name used”.temp, but I could not find either the .genes.results or the .isoforms.results file anywhere.
I don't know why it didn't output the quantification files. Has anyone had similar problems?
I am using RSEM 1.3.0 and STAR 2.5.3 for alignment.