Question

How should I set `STAR` parameters to keep all the intronic reads?

0

8 months ago

Dan ▴ 180

Hello

I am using STAR to map reads from bulk RNA-seq data. My sample should contain some pre-mRNA, which includes intronic regions. When I checked the STAR output BAM file in IGV, I found no intronic reads in most genes. How should I set the STAR parameters to keep all the intronic reads? I generated the reference using:

STAR   --runMode genomeGenerate   --runThreadN 16   --genomeDir ./   \
    --genomeFastaFiles  \
    /Users/Shared/reference/FASTA/hg20/Homo_sapiens.GRCh38.dna.primary_assembly.fa \
    --sjdbGTFfile /Users/Shared/reference/gtf/Homo_sapiens.GRCh38.90.gtf --sjdbOverhang 99

Are my parameter settings correct for mapping intronic regions?

STAR  --genomeDir /Users/Shared/reference/STAR_Homo_sapiens.GRCh38.90   \
    --readFilesCommand gunzip -c \
    --outFilterMultimapNmax 5 \
    --readFilesIn $1 $2  \
    --outFileNamePrefix STAR_one_pass_out/$3"_hg20_"  \
    --runThreadN 20 \
    --outSAMstrandField intronMotif \
    --outSAMunmapped Within \
    --outFilterIntronMotifs RemoveNoncanonicalUnannotated \
    --outSAMattributes All \
    --outSAMtype BAM Unsorted \
    --quantMode GeneCounts \
    --limitSjdbInsertNsj 2000000 \
    --outStd BAM_Unsorted  | sambamba sort -t 6 -m 10G -o STAR_one_pass_out/$3"_hg20_Aligned.sortedByCoord.out.bam" /dev/stdin

sambamba index -p -t 6 STAR_one_pass_out/$3"_hg20_Aligned.sortedByCoord.out.bam"

Thanks

STAR mapping • 681 views

score 4 · Accepted Answer · 2024-02-28

4

8 months ago

dsull ★ 6.9k

There isn't a way to count intronic reads with STAR's `--quantMode GeneCounts`, which only counts reads that overlap exons.

Try using htseq-count instead. You can specify the feature type with `--type` (e.g. set it to `gene` if you want to quantify anything that falls within the gene body, whether exon or intron).
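A minimal sketch of such an htseq-count call on the coordinate-sorted BAM from the question. The sample name, the output file name, and the unstranded setting are assumptions for illustration; check your library's strandedness before relying on the counts. It also assumes the GTF contains `gene` feature lines with a `gene_id` attribute, as Ensembl GTFs do.

```shell
# Hypothetical sample name and output file; adjust the paths to your own files.
sample="sample1"
bam="STAR_one_pass_out/${sample}_hg20_Aligned.sortedByCoord.out.bam"
gtf="/Users/Shared/reference/gtf/Homo_sapiens.GRCh38.90.gtf"
out="${sample}_gene_body_counts.txt"

feature_type="gene"   # count over whole gene bodies (exons and introns)
strandedness="no"     # assumption: unstranded library; otherwise use yes or reverse

# Only run when the tool and both input files are actually present.
if command -v htseq-count >/dev/null 2>&1 && [ -f "$bam" ] && [ -f "$gtf" ]; then
    htseq-count --format bam --order pos \
        --stranded "$strandedness" \
        --type "$feature_type" --idattr gene_id \
        "$bam" "$gtf" > "$out"
else
    echo "htseq-count, BAM, or GTF not found; nothing was run" >&2
fi
```

With `--type exon` (the htseq-count default) the same call would count only exon-overlapping reads, so comparing the two outputs gives a rough idea of how much intronic signal each gene carries.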

2

As a side note, if you use STARsolo for single-cell sequencing, there is a `GeneFull` mode (`--soloFeatures GeneFull`) that counts reads across the whole gene body, including introns.
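For illustration only, a rough sketch of a STARsolo run with `GeneFull` enabled. The FASTQ names and the whitelist file are hypothetical, and the call assumes a 10x-style library whose barcode and UMI lengths match STARsolo's defaults; other chemistries need `--soloCBlen` and `--soloUMIlen` set explicitly.

```shell
# Hypothetical inputs for a 10x-style library; replace with your own files.
genome_dir="/Users/Shared/reference/STAR_Homo_sapiens.GRCh38.90"
cdna_fastq="sample_R2.fastq.gz"      # cDNA read (listed first for STARsolo)
barcode_fastq="sample_R1.fastq.gz"   # cell barcode + UMI read
whitelist="barcode_whitelist.txt"    # hypothetical barcode whitelist
solo_features="Gene GeneFull"        # GeneFull counts exonic and intronic reads

# Only run when STAR and all input files are actually present.
if command -v STAR >/dev/null 2>&1 && [ -f "$cdna_fastq" ] \
        && [ -f "$barcode_fastq" ] && [ -f "$whitelist" ]; then
    # $solo_features is unquoted on purpose so it expands to two arguments.
    STAR --genomeDir "$genome_dir" \
        --readFilesCommand gunzip -c \
        --readFilesIn "$cdna_fastq" "$barcode_fastq" \
        --soloType CB_UMI_Simple \
        --soloCBwhitelist "$whitelist" \
        --soloFeatures $solo_features
else
    echo "STAR or input files not found; nothing was run" >&2
fi
```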

ADD REPLY • link 8 months ago by dsull ★ 6.9k