Question

Salmon libtype guessing and mapping bias

2.3 years ago

lhatschi ▴ 10

Hi everyone,

I am currently trying to reanalyze an existing bulk RNA-seq dataset (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE136731). I downloaded the FASTQ files and trimmed them with Trimmomatic. Before aligning with HISAT2, I wanted to check for strandedness, so I ran salmon on one paired-end sample with automatic library type detection (-l A):

salmon quant -i iHomo_sapiens_salmon -l A -1 SRR10056377_cDC2_CD5Control_trimmed_Seq_1.fastq -2 SRR10056377_cDC2_CD5Control_trimmed_Seq2.fastq --skipQuant -o Transcriptquant_SRR10056400

The output JSON file reported this:

"read_files": "[ SRR10056400_cDC2_CD5-CD163_CD14-Control_trimmed_Seq_1.fastq, SRR10056400_cDC2_CD5-CD163_CD14-Control_trimmed_Seq2.fastq]",
"expected_format": "IU",
"compatible_fragment_ratio": 1.0,
"num_compatible_fragments": 8936467,
"num_assigned_fragments": 8936467,
"num_frags_with_concordant_consistent_mappings": 7528619,
"num_frags_with_inconsistent_or_orphan_mappings": 1532162,
"strand_mapping_bias": 0.5011540629164526,
"MSF": 0,
"OSF": 0,
"ISF": 3772998,
"MSR": 0,
"OSR": 0,
"ISR": 3755621,
"SF": 905506,
"SR": 626656,
"MU": 0,
"OU": 0,
"IU": 0,
"U": 0

As I am a wetlab biologist and not too familiar with computational biology, my question here is:

It looks like I have two types, ISF and ISR, yet the library type salmon guessed is IU (i.e. unstranded). Is that because I got both ISR and ISF? The strand_mapping_bias also does not look too good to me. Any suggestions on how to proceed with the HISAT2 alignment? There I have to specify the strandedness.

Thanks in advance!

type Salmon library • 1.1k views

score 2 · Answer 1 · 2022-09-17

2.3 years ago

Rob 6.9k

Your library is definitely unstranded. An individual alignment must itself be either forward or reverse with respect to its first read, so a single alignment is never "unstranded". Rather, that's a property of the collection of alignments for all reads.

Here, the bias (strand_mapping_bias) refers to the fraction of fragments whose first read maps to the forward strand, and it's very close to 50% (perfectly unbiased). So, you have an unstranded library with approximately equal numbers of reads mapping in each orientation.
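
As a rough check of the arithmetic, the sketch below recomputes that fraction from the ISF and ISR counts in the question. With these numbers, ISF / (ISF + ISR) reproduces the reported strand_mapping_bias, though that is an observation about this output rather than a statement of how salmon computes the field internally. The function names, the 0.1 tolerance, and the lookup table are illustrative assumptions, not salmon or HISAT2 settings. The table reflects the usual correspondence for paired-end reads: ISF matches HISAT2's --rna-strandness FR, ISR matches RF, and for an unstranded library the option is simply left out, since unstranded is HISAT2's default.

```python
# Counts copied from the salmon output shown in the question.
counts = {"ISF": 3772998, "ISR": 3755621}


def forward_fraction(fwd: int, rev: int) -> float:
    """Fraction of fragments whose first read maps to the forward strand."""
    total = fwd + rev
    if total == 0:
        raise ValueError("no fragments to compare")
    return fwd / total


def guess_strandedness(fraction: float, tolerance: float = 0.1) -> str:
    """Illustrative rule of thumb; the tolerance is an assumption, not a salmon setting."""
    if abs(fraction - 0.5) <= tolerance:
        return "unstranded"
    return "forward-stranded" if fraction > 0.5 else "reverse-stranded"


# Hypothetical lookup: value to pass to HISAT2 --rna-strandness for paired-end reads.
# None means "omit the option", because HISAT2 treats libraries as unstranded by default.
HISAT2_STRANDNESS = {
    "unstranded": None,
    "forward-stranded": "FR",  # corresponds to salmon ISF
    "reverse-stranded": "RF",  # corresponds to salmon ISR
}

bias = forward_fraction(counts["ISF"], counts["ISR"])
label = guess_strandedness(bias)
flag = HISAT2_STRANDNESS[label]

print(f"forward fraction: {bias:.4f} ({label})")
print("HISAT2: omit --rna-strandness" if flag is None else f"HISAT2: --rna-strandness {flag}")
```

For this sample the fraction comes out at about 0.50, so the library is treated as unstranded and HISAT2 can be run without --rna-strandness.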