Hi, this is my first time doing whole-genome sequencing on a HiSeq 2000 (2 x 100 bp), and I first sequenced just one lane to test whether my library is good. That one lane gave me 4x and 1.7x coverage across the whole genome for my two samples, respectively, while it was expected to produce 10x coverage.
I'm wondering what to do next when the sequencing yield is this low. Is the problem my library itself (say, an inhibitor; the library concentration is pretty good, though), or the sequencer's efficiency? Should I continue sequencing to reach higher coverage?
I hope someone can share a bit of experience. Thanks.
I think you'll need to clarify and provide more info if you need help. Does "2 X 100bp" mean you ran a paired-end library, or that you multiplexed two libraries into one lane? If the former, what does "X4, X 1.7" coverage mean? If the latter, does that mean that one library gave you 4x coverage and the other gave you 1.7x coverage? You didn't say how many reads you got back in your FASTQ file from the sequencer. Did you run something like FastQC to see the overall quality of the library? What percent of your reads aligned? Can you detect any bias in your coverage, i.e. are there "hills and valleys" in your coverage that you can't account for by mappability?
Not convinced this is a bioinformatics problem; sounds more like problems with sequencing protocol?
I agree it's not necessarily a bioinfo question, but:
There are people on SeqAnswers who will be able to answer much better than I, but one thing you could look for is the reported values for cluster density and "pass-filter" cluster density, which are reported by the HiSeq instrument. If you have a high cluster density pre-filtering, which collapses to a low PF cluster density (equivalently, low % PF clusters), you may have overloaded your lane. In that case you should load using a lower sample concentration.
There are of course many other reasons you could get poor yield; this is just one specific thing to look at.
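To make the cluster-density check concrete, here is a minimal sketch of the arithmetic. The function name and the density values are made up for illustration (they are not from the original poster's run), and what counts as "too low" depends on the instrument and chemistry, so no cutoff is hard-coded here.

```python
def percent_pass_filter(raw_density, pf_density):
    """Percentage of clusters that passed the instrument's quality filter.

    Both densities must be in the same units (e.g. thousands of clusters
    per mm^2, as reported in the run summary).
    """
    if raw_density <= 0:
        raise ValueError("raw cluster density must be positive")
    return 100.0 * pf_density / raw_density


# Hypothetical values, for illustration only.
raw_k_per_mm2 = 1000.0  # clusters detected before filtering
pf_k_per_mm2 = 550.0    # clusters that passed filter

pct = percent_pass_filter(raw_k_per_mm2, pf_k_per_mm2)
print(f"{pct:.1f}% of clusters passed filter")
```

A high raw density combined with a low percentage here is the pattern described above as a possible sign of an overloaded lane.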
Can you provide actual numbers:
1. Paired-end reads, each 100 bp long?
2. How many reads did you obtain in your lane?
3. Comments on your library QC?
4. Sequencing metrics: how are the cluster densities (too low or too high)?
5. Genome size?
6. FASTQ read QC (FastQC output)?
7. Alignment stats (samtools stats etc.)?
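The read count, read length, and genome size asked for above are enough for a back-of-the-envelope coverage check (coverage = sequenced bases / genome size). A rough sketch follows; the function name, the read-pair count, and the 3 Gb genome size are placeholder assumptions, since the thread does not give the real values.

```python
def expected_coverage(n_read_pairs, read_length_bp, genome_size_bp,
                      paired=True, aligned_fraction=1.0):
    """Rough mean coverage: usable sequenced bases divided by genome size.

    aligned_fraction scales the yield by the share of reads that map,
    e.g. 0.8 if 80% of reads aligned.
    """
    if genome_size_bp <= 0:
        raise ValueError("genome size must be positive")
    reads_per_fragment = 2 if paired else 1
    total_bases = n_read_pairs * reads_per_fragment * read_length_bp
    return total_bases * aligned_fraction / genome_size_bp


# Placeholder numbers, for illustration only (not from this thread).
pairs = 150_000_000           # read pairs obtained from the lane
read_len = 100                # 2 x 100 bp run
genome_size = 3_000_000_000   # assumed genome size in bp

print(f"all reads:   {expected_coverage(pairs, read_len, genome_size):.1f}x")
print(f"80% aligned: "
      f"{expected_coverage(pairs, read_len, genome_size, aligned_fraction=0.8):.1f}x")
```

Comparing this estimate with the coverage actually observed helps separate a yield problem (too few reads off the sequencer) from an alignment or library problem (enough reads, but many unmapped or duplicated).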