I have run the mapping with STAR for a PE data with the following code:
STAR --runThreadN 20 \
--genomeDir /genome_index150 \
--readFilesCommand zcat \
--readFilesIn "$file" "$file2" \
--outFileNamePrefix "$OUTPUT_DIR$base_name" \
--outSAMtype BAM SortedByCoordinate \
--sjdbOverhang 149 \
--alignIntronMax 500 \
--outReadsUnmapped Fastx \
--outFilterMismatchNoverLmax 0.05 \
--quantMode GeneCounts TranscriptomeSAM GeneCounts
I get the following number of overlapping genes (blue) when I also run the --quantMode GeneCounts (as indicated before):
However, when I run RSEM for counting using the transcriptome BAM files from STAR I get a total number of mapped reads that correspond aprox. to half the uniquely mapped reads with STAR default htseq-count.
This is the code used for RSEM:
rsem-calculate-expression --bam --no-bam-output --paired-end --forward-prob 0 --estimate-rspd --append-names\
"$bam_file" \
"$rsem_index" \
"$sample_output_dir/$base_name_trimmed" \
-p "$num_threads"
theoretically the RSEM output should be equivalent to the uniquely overlapping reads from htseeq-count, right? What is happing while counting with STAR geneCounts parameter?