Hi Biostars,
I ran Salmon on my RNA-seq data and got the following warning message:
Greater than 5% of the fragments disagreed with the provided library type; check the file: Pool9_21565_GTTTCG_quant/lib_format_counts.json for details
The lib_format_counts.json file looks like this:
"read_files": "/home/ghovhannisyan/users/tg/hhovhannisyan/col_cancer/raw_data/Pool3_21559_ACTGAT.fastq.gz",
"expected_format": "SR",
"compatible_fragment_ratio": 0.8350394020362054,
"num_compatible_fragments": 7447997,
"num_assigned_fragments": 8919336,
"num_consistent_mappings": 12428090,
"num_inconsistent_mappings": 2455593,
"MSF": 0,
"OSF": 0,
"ISF": 0,
"MSR": 0,
"OSR": 0,
"ISR": 0,
"SF": 2455593,
"SR": 12428090,
"MU": 0,
"OU": 0,
"IU": 0,
"U": 0
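To put the counts above in proportion, here is a minimal Python sketch that tallies what share of the mappings fell into each orientation category. The function name `orientation_fractions` is made up for illustration, and the numbers are copied by hand from the excerpt above; in practice you would probably load the file with `json.load` instead.

```python
# Counts copied from the lib_format_counts.json excerpt above.
counts = {
    "expected_format": "SR",
    "compatible_fragment_ratio": 0.8350394020362054,
    "num_compatible_fragments": 7447997,
    "num_assigned_fragments": 8919336,
    "num_consistent_mappings": 12428090,
    "num_inconsistent_mappings": 2455593,
    "MSF": 0, "OSF": 0, "ISF": 0,
    "MSR": 0, "OSR": 0, "ISR": 0,
    "SF": 2455593, "SR": 12428090,
    "MU": 0, "OU": 0, "IU": 0, "U": 0,
}

# Orientation categories as they appear in the excerpt.
ORIENTATION_KEYS = [
    "MSF", "OSF", "ISF", "MSR", "OSR", "ISR",
    "SF", "SR", "MU", "OU", "IU", "U",
]


def orientation_fractions(counts):
    """Return the share of mappings per orientation, skipping empty categories.

    Hypothetical helper for illustration; not part of Salmon.
    """
    total = sum(counts.get(key, 0) for key in ORIENTATION_KEYS)
    if total == 0:
        return {}
    return {
        key: counts[key] / total
        for key in ORIENTATION_KEYS
        if counts.get(key, 0) > 0
    }


fractions = orientation_fractions(counts)
for key, share in sorted(fractions.items(), key=lambda item: -item[1]):
    print(f"{key}: {share:.1%}")

# Recompute the fragment-level ratio from the two fragment counts.
recomputed_ratio = (
    counts["num_compatible_fragments"] / counts["num_assigned_fragments"]
)
print(f"compatible fragments: {recomputed_ratio:.4f}")
```

With these numbers, roughly one mapping in six is SF rather than the expected SR, and the recomputed fragment ratio agrees with the reported compatible_fragment_ratio.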
I know for sure that the library preparation protocol was TruSeq stranded (the SR option in Salmon). How should I interpret this result? Does it point to some level of antisense transcription?
Thanks
What organism? My impression is that for poorly annotated genomes (which means pretty much any genome except human and mouse) there are some, or even many, unannotated overlapping features, and reads from them end up being assigned to the wrong feature. If you are working with human or mouse, then I am even more clueless.