Hi all,
I am currently performing a DTU analysis following salmon quantification. I started by converting BAM files, which had been aligned to the reference genome with STAR, back to paired-end fastq files with bam2fq, since I do not have the original fastq files at hand right now. When running salmon on these files (automatic library type detection), I get the following warning:
"Detected a potential strand bias > 1% in an unstranded protocol check the file SALMON/lib_format_counts.json for details"
According to the corresponding file, the library is stranded:
{
"read_files": "[ sample_r1.fq, sample_r2.fq]",
"expected_format": "ISR",
"compatible_fragment_ratio": 1.0,
"num_compatible_fragments": 22131862,
"num_assigned_fragments": 22131862,
"num_frags_with_concordant_consistent_mappings": 19984860,
"num_frags_with_inconsistent_or_orphan_mappings": 2245202,
"strand_mapping_bias": 0.000004803613277907642,
"MSF": 0,
"OSF": 0,
"ISF": 96,
"MSR": 0,
"OSR": 0,
"ISR": 19984860,
"SF": 1069736,
"SR": 1175370,
"MU": 0,
"OU": 0,
"IU": 0,
"U": 0}
Also, when coloring reads in IGV by "first-in-pair read strand", the library appears to be stranded, as almost all reads have the same color. Since I double-checked the new fastq files and the strandedness is already visible in the BAM file, I am fairly sure that the conversion to fastq went well.
However, we did not use a stranded RNA-seq library protocol, and I do not understand why I got these results. Does anyone have an idea why this might be the case? And do you think this affects the downstream analysis (e.g. DRIMSeq), or can I still use the quantification?
I really appreciate any help you can provide.
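For reference, the strand split in the counts above can be worked out directly. This is only a minimal Python sketch, with the four non-zero counts copied by hand from lib_format_counts.json; the helper name strand_fractions is just illustrative and not part of salmon.

```python
# Non-zero counts copied from the lib_format_counts.json shown above.
counts = {"ISF": 96, "ISR": 19984860, "SF": 1069736, "SR": 1175370}


def strand_fractions(forward, reverse):
    """Return (forward, reverse) as fractions of their sum."""
    total = forward + reverse
    if total == 0:
        return 0.0, 0.0
    return forward / total, reverse / total


# Properly paired fragments (inward orientation): ISF vs. ISR.
paired_fwd, paired_rev = strand_fractions(counts["ISF"], counts["ISR"])

# Orphan / single-end mappings: SF vs. SR.
orphan_fwd, orphan_rev = strand_fractions(counts["SF"], counts["SR"])

print(f"paired:  forward={paired_fwd:.3e}  reverse={paired_rev:.6f}")
print(f"orphans: forward={orphan_fwd:.3f}  reverse={orphan_rev:.3f}")
```

For the concordant pairs, the forward fraction comes out at about 4.8e-06, which agrees with the reported strand_mapping_bias, so essentially every properly paired fragment is ISR. The orphan mappings are split roughly evenly between SF and SR.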
Which kit did you use for the library prep?
Unfortunately, I do not know the exact kit right now since the data was generated before I came to the lab. Hope to get an answer soon from the wet-lab people. Are there any kits that tend to show this behavior?
Also, is there any chance that this comes from the initial alignment? I used pretty much the ENCODE-recommended settings with STAR, so I think this should be fine.
Were the files collated/name-sorted before the conversion (or does bam2fq manage that internally)?
Yes, I forgot to mention this. I used samtools sort -n before the conversion.
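One way to double-check that the mates stayed paired after the conversion is to compare read names across the two fastq files. The following is a minimal Python sketch, assuming uncompressed four-line fastq records and read names that may carry a /1 or /2 suffix. The function names read_names and mates_in_sync are my own and not part of samtools or salmon.

```python
from itertools import zip_longest


def read_names(lines):
    """Yield the read name of each 4-line fastq record, without '@' or /1, /2."""
    for i, line in enumerate(lines):
        if i % 4 != 0:
            continue  # only the first line of each record holds the name
        fields = line.strip().split()
        name = fields[0] if fields else ""
        if name.startswith("@"):
            name = name[1:]
        if name.endswith("/1") or name.endswith("/2"):
            name = name[:-2]
        yield name


def mates_in_sync(r1_lines, r2_lines):
    """Return True if both files list the same read names in the same order."""
    for n1, n2 in zip_longest(read_names(r1_lines), read_names(r2_lines)):
        if n1 != n2:  # also catches files of different length (None vs. name)
            return False
    return True


# Tiny in-memory example standing in for the two fastq files.
r1 = ["@readA/1", "ACGT", "+", "IIII", "@readB/1", "TTGA", "+", "IIII"]
r2 = ["@readA/2", "TGCA", "+", "IIII", "@readB/2", "CCAT", "+", "IIII"]
r2_swapped = r2[4:] + r2[:4]  # same records, wrong order

print(mates_in_sync(r1, r2))
print(mates_in_sync(r1, r2_swapped))
```

With real data, open file handles can be passed instead of lists, e.g. mates_in_sync(open("sample_r1.fq"), open("sample_r2.fq")).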