From my limited experience with genomic DNA NGS, I can calculate how many genomic samples (whether WES (~50 Mb), WGS (~3 Gb), or targeted sequencing) to pool in one HiSeq 2500 lane and how many lanes a project needs (e.g. assuming ~50-80 Gb of data per lane), and use that to estimate the total cost of the experiments. However, this does not seem to translate well to RNA-seq. Could anyone give me some pointers on (for lack of a better term) "RNA-seq statistics" to help me estimate the cost of an RNA-seq experiment? For example:
(a) In one HiSeq 2500 lane, how many reads in total could I get from 2 µg of mRNA with a standard cDNA PCR cycle (5 min reverse transcription + 3 min PCR extension)?
(b) Nature Methods 6, 377-382 (2009) reports, for example, that with a modified cDNA PCR cycle (reverse transcription extended from 5 min to 30 min, and PCR extension from 3 min to 6 min) the authors obtained 100 million 50-base reads, i.e. ~5 Gb. If we assume 50-80 Gb of data per lane on an Illumina HiSeq 2500, does this mean that with the Nature Methods protocol one could pool 10 different samples in one lane?
(c) What is the average number of transcript reads for a sample with 2 µg of mRNA as starting material?
(d) The UCSC hg19 statistics report a total of 34702 genes. On average, how many of these 34702 genes are transcribed, and what is the average number of transcripts per gene?
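For context, this is the back-of-the-envelope arithmetic behind (b), written as a minimal Python sketch. The function names and the 24-sample project size are mine and purely illustrative. It assumes single-end reads and that the nominal 50-80 Gb lane yield is fully usable, which may not hold for RNA-seq.

```python
import math


def samples_per_lane(lane_yield_gb, reads_per_sample, read_length_bases):
    """Whole samples that fit in one lane for a given per-sample read target."""
    bases_per_sample = reads_per_sample * read_length_bases  # single-end assumption
    lane_bases = lane_yield_gb * 10**9
    return lane_bases // bases_per_sample


def lanes_needed(n_samples, per_lane):
    """Lanes required to sequence n_samples when per_lane samples share a lane."""
    return math.ceil(n_samples / per_lane)


# Figures from (b): 100 million 50-base reads per sample, i.e. ~5 Gb.
low = samples_per_lane(50, 100_000_000, 50)   # at 50 Gb per lane
high = samples_per_lane(80, 100_000_000, 50)  # at 80 Gb per lane
print(low, high)  # 10 16

# Hypothetical project of 24 samples, pooled at the conservative figure.
print(lanes_needed(24, low))  # 3
```

Multiplying the lane count by a per-lane price would then give a rough sequencing cost, which is what I do for DNA projects. What I do not know is whether ~5 Gb per sample is a sensible target for RNA-seq in the first place.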
Any information on any of these points, or any other references on RNA-seq experiments, would be greatly appreciated.
Thank you
Please learn to ask one question per post.