Hi,
Can anyone point me to a paper evaluating how well lowly expressed genes can be detected by PacBio RNA-seq without normalization? Normalization reduces the representation of highly expressed genes in order to increase the diversity of detected transcripts [1]. However, to my knowledge, normalization is not straightforward. Many companies refuse to provide a full-length RNA normalization service because the effect of the normalization is not easy to evaluate.
Indeed, there should be a correlation between the detection power of PacBio for lowly expressed transcripts and the sequencing depth.
Thank you!
[1] http://www.pacb.com/blog/tutorial-on-iso-seq-method-applications/
I performed Iso-Seq on a cell line without any type of target enrichment, and 2 SMRT cells yielded 550k high-quality full-length non-chimeric reads and 908k low-quality non-full-length reads. More of these low-quality reads are converted to high quality with greater sequencing depth. I got in contact with bioinformatics support on their end, and she told me the only way to generate more HQ reads was to sequence more. The Iso-Seq pipeline detects the best polymerase read length to use for CCS calculations and optimizes this step, so by the time it tries to classify reads as high quality, it has already generated the "best" CCS reads possible. To get a high-quality read, the full-length transcript must be covered in its entirety by CCS reads (not raw reads), so a lowly expressed transcript must not only be covered by raw reads but must also yield accurate CCS reads. So your conclusion is correct: there is a correlation between the detection power of PacBio on lowly expressed transcripts and sequencing depth. All this being said, if you are looking for lowly expressed transcripts, I would not recommend doing so without first using a targeted enrichment strategy, as you may fork out for 4 SMRT cells on a cell line and still miss what you are looking for.
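To put rough numbers on the depth argument above, here is a minimal sketch that treats full-length reads as independent Poisson draws from the transcript pool. That is an assumption of mine, not how the Iso-Seq pipeline works: it ignores length and loading bias and the CCS quality requirement, so it is optimistic. The function names, the 1e-6 transcript fraction, and the per-cell yield of 275k reads (half of the 550k from 2 SMRT cells) are illustrative only.

```python
import math


def detection_probability(n_reads: int, fraction: float, min_reads: int = 1) -> float:
    """Probability of sampling at least `min_reads` reads of a transcript.

    Assumes the read count for the transcript is Poisson with mean
    n_reads * fraction, where `fraction` is the transcript's share of
    all full-length reads (a hypothetical input, not a measured value).
    """
    lam = n_reads * fraction
    # P(X < min_reads) for X ~ Poisson(lam)
    p_below = sum(
        math.exp(-lam) * lam**k / math.factorial(k) for k in range(min_reads)
    )
    return 1.0 - p_below


def reads_needed(fraction: float, target: float = 0.95) -> int:
    """Smallest read count with at least `target` probability of one read."""
    # Solve 1 - exp(-N * fraction) >= target for N.
    return math.ceil(-math.log(1.0 - target) / fraction)


if __name__ == "__main__":
    rare = 1e-6  # hypothetical: one in a million full-length reads
    for cells in (1, 2, 4, 8):
        n = cells * 275_000  # illustrative per-cell yield
        p = detection_probability(n, rare, min_reads=2)
        print(f"{cells} SMRT cells ({n} reads): P(>=2 reads) = {p:.3f}")
    print("reads for 95% chance of one read:", reads_needed(rare))
```

Under these assumptions the detection probability rises with depth but saturates slowly for rare transcripts, which is consistent with the advice to enrich first rather than only sequencing deeper.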
What do you mean by "without normalization"? Normalization shouldn't have an effect on the ability of a technology to detect something. In addition, detection will depend on how deeply you sequence the library (how many SMRT cells, I believe, in PacBio sequencing) and on how abundant the gene really is.
@WouterDeCoster, thanks for your reply. I have edited my question and provided more information about the normalization and the power of detection.
Normalization will influence the reported abundance (quantity measurement) of a transcript but not the ability to detect the presence (qualitative measurement).