STAR pair-end gene count discrepancy vs. RSEM
0
0
Entering edit mode
18 days ago
cropero • 0

I have run the mapping with STAR for a PE data with the following code:

STAR --runThreadN 20 \ --genomeDir /genome_index150 \ --readFilesCommand zcat \ --readFilesIn "$file" "$file2" \ --outFileNamePrefix "$OUTPUT_DIR$base_name" \ --outSAMtype BAM SortedByCoordinate \ --sjdbOverhang 149 \ --alignIntronMax 500 \ --outReadsUnmapped Fastx \ --outFilterMismatchNoverLmax 0.05 \ --quantMode GeneCounts TranscriptomeSAM GeneCounts

I get the following number of overlapping genes (blue) when I also run the --quantMode GeneCounts (as indicated before):

enter image description here

However, when I run RSEM for counting using the transcriptome BAM files from STAR I get a total number of mapped reads that correspond aprox. to half the uniquely mapped reads with STAR default htseq-count. enter image description here

This is the code used for RSEM:

rsem-calculate-expression --bam --no-bam-output --paired-end --forward-prob 0 --estimate-rspd --append-names\
        "$bam_file" \
        "$rsem_index" \
        "$sample_output_dir/$base_name_trimmed" \
        -p "$num_threads"

theoretically the RSEM output should be equivalent to the uniquely overlapping reads from htseeq-count, right? What is happing while counting with STAR geneCounts parameter?

star genecount rsem htseq illumina • 222 views
ADD COMMENT

Login before adding your answer.

Traffic: 1861 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6