Hello,
I am trying to use Salmon to quantify read counts from my FASTQ files with salmon quant. My question is: how can I determine the library type? I know Salmon provides automatic library type detection, but I want to make sure I choose the right one.
From the GEO dataset, the extraction protocol is as follows:
RNA was extracted from all samples using the AllPrep DNA/RNA FFPE kit from Qiagen and suspended in nuclease free water (RNA) or AE buffer (DNA). Sample concentrations were measured using NanoDrop 1000 (Thermo Fisher Scientific), and RNA integrity numbers were obtained on the 2100 Bioanalyzer (Agilent Technologies) according to the manufacturer’s protocol. Approximately 30 ng of RNA was used based on sample concentrations obtained from the Qubit HS RNA assay (Thermo Fisher Scientific). All samples were reverse-transcribed to generate cDNA libraries using the Ion AmpliSeq Transcriptome Human Gene Expression Kit (Thermo Fisher Scientific) adhering to the manufacturer’s protocol for FFPE samples with 16 cycles of target amplification. Library concentrations were determined by qPCR using an absolute quantitation method and the Ion Library TaqMan Quantitation Kit (Thermo Fisher Scientific) following the manufacturer’s protocol. Template reactions were carried out using the Ion PI Hi-Q OT2 200 Kit (Thermo Fisher Scientific) according to the manufacturer’s instructions and then loaded onto Ion PI chips v3 using the Ion PI Hi-Q Sequencing 200 Kit based on the manufacturer’s protocol (Thermo Fisher Scientific).
It seems the cDNA libraries were generated using the Ion AmpliSeq Transcriptome Human Gene Expression Kit. I tried reading the user guide but could not find how the library is prepared. For the samples from GSE102511, Salmon detects most as U (unstranded), but some are detected as SF (single-end, forward strand).
Since all samples used the same protocol, I think the library type should be the same for all of them.
Can anyone familiar with the protocol suggest a libtype parameter for Salmon? If not, I will just go with automatic detection.
On top of that, how much will it affect the read counts if I use the wrong library type for Salmon or other alignment software?
Thank you.
From just reading the kit specifications, I did not see any information on strandedness. From the library information on NCBI, it is single-end sequencing, so I would go for the unstranded setting. Run it and see if the mapping rate is adequate.
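One way to check the mapping rate after a run is to read it from the Salmon output directory. The sketch below assumes each run wrote aux_info/meta_info.json with a percent_mapped field, which is what I have seen in Salmon output but may differ between versions. The function name mapping_rate is hypothetical.

```python
import json
from pathlib import Path


def mapping_rate(quant_dir):
    """Return the percent of reads mapped for one salmon quant output directory.

    Assumption: salmon wrote aux_info/meta_info.json with a 'percent_mapped' key.
    """
    meta_path = Path(quant_dir) / "aux_info" / "meta_info.json"
    with open(meta_path) as handle:
        meta = json.load(handle)
    return float(meta["percent_mapped"])
```

Calling mapping_rate on each sample's output directory gives a quick table to scan for samples with an unusually low rate.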
The automatic detection gives quite a good mapping rate, and most of the samples are detected as unstranded, so I think I will just set it to automatic then.
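To see how many samples fall into each detected library type, something like the following could tally the calls across all output directories. It is only a sketch: it assumes one sub-directory per sample under a common root, and that each contains a lib_format_counts.json whose expected_format field holds the type Salmon settled on. The function names are hypothetical.

```python
import json
from collections import Counter
from pathlib import Path


def detected_lib_types(quant_root):
    """Map each sample directory under quant_root to the library type salmon reported.

    Assumption: every sample directory holds lib_format_counts.json with an
    'expected_format' key; samples missing the key are labelled 'unknown'.
    """
    types = {}
    for counts_path in sorted(Path(quant_root).glob("*/lib_format_counts.json")):
        with open(counts_path) as handle:
            counts = json.load(handle)
        types[counts_path.parent.name] = counts.get("expected_format", "unknown")
    return types


def tally(types):
    """Count how many samples were assigned each library type."""
    return Counter(types.values())
```

If nearly all samples come out as U and only a handful as SF, that would be consistent with what is described above.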