Hi there,
I am trying to find out what the best sequencing depth is for my mRNA-seq experiment. I have a couple of mouse samples (isolated cells and tissues) and I would like to perform mRNA-seq and get a reliable quantitative value (RPKM) for each gene (including isoforms). This is going to be on an Illumina system (2x100bp), as this is what is available on campus. As I do not have any experience with mRNA-seq, I have asked around and got quite different suggestions. On one side, I was advised to get 20-50M sequenced fragments/sample, as this would be enough; on the other side, I was told 65 to 100M sequenced fragments/sample.
So, my question is: who is right? Can someone suggest the optimal depth?
Thank you
See if this helps: https://genohub.com/rna-seq-library-preparation/
Hi genomax2, thanks! As far as I understood from your link, I should do 50-100M reads (25-50M fragments for paired end?!?) for "Transcriptome Sequencing - Alternative splicing".
It depends on what your final aim is. If you are looking for rare events or alternative splicing, then going PE and deeper would be needed. This thread has more information that should help you decide: "Varying Sample Size vs Read Depth, Read Length, and Single vs Paired end to optimize DE analysis of RNA-Seq"
Thanks, your suggestions were very useful. I am now leaning towards PE/30M fragments (60M reads) per sample. I am not interested in DE analysis, as I am not going to compare the samples directly with each other; I am more interested in knowing how much of each gene (and each gene isoform) is in each sample.
Going a bit deeper (PE/70M fragments, i.e. 140M reads) would cost about 4-5 times more, and if this is not necessary (at least this is what I understood), then I could save some money, which is always good :)
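For anyone following the fragment/read bookkeeping above, it can be checked with a few lines of Python. This is only a sketch: the function name is made up, and it assumes every fragment yields exactly two 100 bp reads (2x100bp, no trimming or filtering).

```python
def paired_end_yield(fragments, read_length_bp=100, reads_per_fragment=2):
    """Return (reads, total bases) for a paired-end run.

    Assumes every fragment gives exactly `reads_per_fragment` reads of
    `read_length_bp` bases each (no trimming or filtering).
    """
    reads = fragments * reads_per_fragment
    bases = reads * read_length_bp
    return reads, bases


# The two depths discussed in this thread.
for fragments in (30_000_000, 70_000_000):
    reads, bases = paired_end_yield(fragments)
    print(f"{fragments / 1e6:.0f}M fragments -> "
          f"{reads / 1e6:.0f}M reads, {bases / 1e9:.0f} Gb")
```

So 30M fragments correspond to 60M reads (6 Gb of sequence), and 70M fragments to 140M reads (14 Gb).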
Will there be a reference sample? If so, how is this different from DE? If you are not planning to use a reference, things could get tricky, since you will still need to normalize the data (amount of starting RNA, number of reads, etc.) before you do the comparisons.
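To make the "number of reads" part of that normalization concrete, here is a minimal sketch of the usual RPKM calculation (reads per kilobase of transcript per million mapped reads). The gene names, counts, lengths, and the function name are made up for illustration. With paired-end data, the same formula applied to fragments instead of reads gives FPKM. Note that it only corrects for gene length and sequencing depth, not for the amount of starting RNA.

```python
def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    # 1e9 = 1e3 (bp -> kb) * 1e6 (reads -> millions of reads)
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


# Hypothetical example values, not taken from any real sample.
total_mapped = 30_000_000
genes = {
    "geneX": (1500, 2000),  # (mapped reads, gene length in bp)
    "geneY": (300, 500),
}

for name, (count, length) in genes.items():
    print(name, round(rpkm(count, length, total_mapped), 2))
```

In this toy case geneX has five times more reads than geneY, but after length normalization the two values are close (25 vs 20), which is why raw counts alone are not enough when ranking genes within one sample.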
Hi,
What I meant is that I would like to know how much of gene X, Y, and Z is present in sample A (relative amount). I will then compare this information to how much of protein X, Y, and Z is present in sample A (again, relative protein abundance). Same for samples B, C, D... I am not going to compare the abundance of gene X in sample A vs sample B.