How to plot alignment statistics; how many reads are mapped to the genome?
3.8 years ago
anamaria ▴ 220

Hello,

I am doing RNA-seq analysis and plan to run the following steps:

hisat2 -p 12 --new-summary --summary-file $OUTPUT.hisat2.summary -x $REF -1 $R1 -2 $R2 -S $OUTPUT.sam

# 1- Convert SAM to BAM
samtools view -bS -o $OUTPUT.bam $OUTPUT.sam

# 2- Sort BAM file
samtools sort $OUTPUT.bam -o $OUTPUT.sorted.bam

# 3- Generate index for BAM file
samtools index $OUTPUT.sorted.bam

I know that I can separate the mapped and unmapped reads with:

# flag 4 marks an unmapped read (flag 2 would select properly paired reads instead)
samtools view -b -F 4 $OUTPUT.bam > mapped.bam
samtools view -b -f 4 $OUTPUT.bam > unmapped.bam
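
If only the numbers are needed, samtools can count records directly instead of writing new BAM files. The following is a minimal sketch, assuming samtools is on the PATH; the function names `count_mapped` and `count_unmapped` are hypothetical helpers, not part of samtools.

```shell
# Count alignment records instead of writing new BAM files.
# -c prints a record count; flag 4 (0x4) marks an unmapped read.
# Note: the mapped count is a count of alignment records, so a read with
# secondary alignments is counted more than once.
count_mapped()   { samtools view -c -F 4 "$1"; }
count_unmapped() { samtools view -c -f 4 "$1"; }

# Usage, with $OUTPUT as in the commands above:
#   count_mapped "$OUTPUT.sorted.bam"
#   count_unmapped "$OUTPUT.sorted.bam"
#   samtools flagstat "$OUTPUT.sorted.bam"   # fuller breakdown in one call
```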

Can someone please recommend code to make a plot like the one attached?

[attached image]

samtools RNA-Seq • 1.6k views
3.8 years ago

MultiQC will automatically generate reports for HISAT2 and a bunch of other software, such as FastQC.


Thank you so much! So basically, if I run hisat2 with the --new-summary flag, I will get summary stats that MultiQC can use to generate plots? Do you have a tutorial on exactly how MultiQC is used for that purpose?

Or is all I need to run: multiqc .

and it will generate the output from whatever it finds in the current directory? Please advise.


It will look through all the files and directories contained within the directory you specify for compatible results/reports.

So, for example, suppose you have a project directory that contains one directory with your FastQC results and another with your HISAT2 results. If you specify that project directory, MultiQC will generate a single report that includes both the FastQC and the HISAT2 results.
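
As a quick check of what a recursive scan like this would find, the HISAT2 summary files can also be listed by hand. The sketch below assumes the files are named `<sample>.hisat2.summary`, as in the question, and that each contains a line of the form `Overall alignment rate: 96.50%`. The function name `alignment_rates` and the directory name `project_dir` are hypothetical.

```shell
# Print "sample<TAB>overall alignment rate" for every HISAT2 summary file
# found anywhere under a directory (default: the current directory).
# Assumption: files are named <sample>.hisat2.summary and contain a line
# like "Overall alignment rate: 96.50%".
alignment_rates() {
    local dir="${1:-.}"
    find "$dir" -type f -name '*.hisat2.summary' | sort | while IFS= read -r f; do
        sample=$(basename "$f" .hisat2.summary)
        # Split on ": " and keep the value after the label.
        rate=$(awk -F': *' '/Overall alignment rate/ {print $2}' "$f")
        printf '%s\t%s\n' "$sample" "$rate"
    done
}

# Usage (project_dir is a hypothetical directory name):
#   alignment_rates project_dir
#   multiqc project_dir    # builds the report from the same directory tree
```

The tab-separated output is only a sanity check on the inputs; the plots themselves come from the MultiQC report.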
