Question

Utilization of STAR Output for RNA-seq Analysis

17 months ago

wyt1995 ▴ 40

Hello, I am a novice graduate student conducting RNA-seq analysis. Below are the STAR command I ran and the output files it produced.

# STAR
STAR \
--readFilesIn run_clean_1.fastq.gz run_clean_2.fastq.gz \
--genomeDir STAR_genomeGenerate \
--readFilesCommand zcat \
--runThreadN 10 \
--twopassMode Basic \
--outFilterMultimapNmax 20 \
--alignSJoverhangMin 8 \
--alignSJDBoverhangMin 1 \
--outFilterMismatchNmax 999 \
--outFilterMismatchNoverLmax 0.1 \
--alignIntronMin 20 \
--alignIntronMax 1000000 \
--alignMatesGapMax 1000000 \
--outFilterType BySJout \
--outFilterScoreMinOverLread 0.33 \
--outFilterMatchNminOverLread 0.33 \
--limitSjdbInsertNsj 1200000 \
--outFileNamePrefix STAR/run \
--outSAMstrandField intronMotif \
--outFilterIntronMotifs None \
--alignSoftClipAtReferenceEnds Yes \
--quantMode TranscriptomeSAM GeneCounts \
--outSAMtype BAM Unsorted \
--outSAMunmapped Within \
--genomeLoad NoSharedMemory \
--chimSegmentMin 15 \
--chimJunctionOverhangMin 15 \
--chimOutType Junctions SeparateSAMold WithinBAM SoftClip \
--chimOutJunctionFormat 1 \
--chimMainSegmentMultNmax 1 \
--outSAMattributes NH HI AS nM NM ch

# Output
# run.Aligned.out.bam
# run.Aligned.toTranscriptome.out.bam
# run.Chimeric.out.junction
# run.Chimeric.out.sam
# run.Log.final.out
# run.Log.out
# run.Log.progress.out
# run.ReadsPerGene.out.tab
# run.SJ.out.tab

Among these results, I would like to use DESeq to identify DEGs (differentially expressed genes), and I understand that ReadsPerGene is commonly used for that purpose. Are the remaining files, such as Aligned.out.bam and Aligned.toTranscriptome.out.bam, needed for my analysis?

STAR • 900 views

ADD COMMENT • link updated 17 months ago by biofalconch ★ 1.3k • written 17 months ago by wyt1995 ▴ 40

Not really, unless you want to count your reads into features again. The BAM files are just the results of mapping your reads to your reference. They have other uses (e.g. visualisation of expression within a gene body, variant calling, etc.), but they are not really necessary for differential gene expression analysis :)
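
For reference, here is a rough sketch of how the ReadsPerGene file could be turned into a plain gene/count table before loading it into DESeq. STAR writes four tab-separated columns (gene ID, unstranded counts, counts for reads on the gene's strand, counts for reads on the reverse strand), and the first four rows are summary lines (N_unmapped, N_multimapping, N_noFeature, N_ambiguous). The function name `extract_counts` is made up for this example, and which column is right depends on the strandedness of your library, so treat the default as an assumption.

```shell
# Hypothetical helper: print "gene<TAB>count" from a STAR ReadsPerGene.out.tab file.
# $1 = path to the ReadsPerGene file
# $2 = count column to keep: 2 (unstranded, assumed default), 3 (forward) or 4 (reverse)
extract_counts() {
    # tail skips the four N_* summary rows; cut keeps the gene ID and the chosen column
    tail -n +5 "$1" | cut -f "1,${2:-2}"
}

# Example call (file name taken from the output list in the question):
# extract_counts run.ReadsPerGene.out.tab 2 > run.counts.tsv
```

Running this once per sample gives one two-column table each, which can then be merged into a count matrix. If the N_noFeature value is much lower in column 3 or 4 than in the other, that is usually a hint about which strand your protocol produces.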

ADD REPLY • link 17 months ago by biofalconch ★ 1.3k

Oh, thanks. If possible, could you provide me with a website or give a brief explanation about the possible analysis of the remaining files? I apologize for being shameless, but I would appreciate your assistance.

ADD REPLY • link 17 months ago by wyt1995 ▴ 40

IGV is a tool you can use to visualize your reads: https://software.broadinstitute.org/software/igv/BAM

or maybe something like this: https://expert.cheekyscientist.com/how-to-do-variant-calling-from-rnaseq-ngs-data/
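
One thing to keep in mind for IGV: it expects a coordinate-sorted and indexed BAM, and the command in the question uses `--outSAMtype BAM Unsorted`. A minimal sketch of the preparation step, assuming samtools is installed and on your PATH (the function name `prepare_bam_for_igv` and the `.sorted.bam` suffix are just choices made for this example):

```shell
# Hypothetical helper: coordinate-sort and index a BAM so IGV can load it.
# $1 = path to an unsorted BAM; prints the path of the sorted BAM on success.
prepare_bam_for_igv() {
    in_bam="$1"
    # Derive the output name, e.g. run.Aligned.out.bam -> run.Aligned.out.sorted.bam
    out_bam="${in_bam%.bam}.sorted.bam"
    # samtools sort orders reads by coordinate; samtools index writes the .bai next to it
    samtools sort -o "$out_bam" "$in_bam" \
        && samtools index "$out_bam" \
        && printf '%s\n' "$out_bam"
}

# Example call (file name taken from the output list in the question):
# prepare_bam_for_igv run.Aligned.out.bam
```

Alternatively, STAR can write a sorted BAM directly with `--outSAMtype BAM SortedByCoordinate`, in which case only the indexing step is needed.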

ADD REPLY • link 17 months ago by biofalconch ★ 1.3k