These are two 454 runs from the same source of genomic DNA. I need to decide which set of sequences to use. Is one of these runs of better quality, judging by the read length distribution? Thanks in advance for answers from more experienced users of 454 data.
If these two runs come from the same library, and differ only by the run, then I would consider the first run to be "better" because of the narrower peak around 500bp.
If these are runs from different libraries made from the same genomic DNA, then I would look at any differences in the library protocols and in the 454 machine settings, and definitely at the QC report from the 454 machine.
Aside from that, I believe the definition of a "better" run should be based on average read quality distribution rather than on read length distribution.
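To make the read-quality comparison concrete, here is a minimal sketch that computes the mean Phred score of each read and summarizes the distribution. It assumes the reads have already been converted to plain 4-line FASTQ with Phred+33 encoding (native 454 output is usually SFF or FASTA plus a .qual file, so a conversion step would come first); the function names `read_mean_qualities` and `summarize` are made up for illustration.

```python
import io
from statistics import mean, median

def read_mean_qualities(handle, offset=33):
    """Yield the mean Phred score of each read in a 4-line FASTQ stream.

    Assumes Phred+33 encoding and exactly four lines per record.
    """
    while True:
        header = handle.readline()
        if not header:
            break
        handle.readline()  # sequence line (not needed here)
        handle.readline()  # '+' separator line
        qual = handle.readline().rstrip("\n")
        if qual:
            yield mean(ord(c) - offset for c in qual)

def summarize(values):
    """Return count, median, min and max of per-read mean qualities."""
    values = sorted(values)
    if not values:
        raise ValueError("no reads found")
    return {
        "n": len(values),
        "median": median(values),
        "min": values[0],
        "max": values[-1],
    }

# Toy input standing in for a real file opened with open("run1.fastq")
toy = io.StringIO("@r1\nACGT\n+\nIIII\n@r2\nACGT\n+\n5555\n")
print(summarize(read_mean_qualities(toy)))
```

Running the same summary on both runs and comparing the medians (or plotting the full distributions) gives a more direct view of quality than the length histogram does.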
I don't think you can judge quality from lengths alone when the lengths are comparable. (I say comparable because if most sequences in one graph were 50-100 bp long and the other showed, say, 400-500 bp, then obviously something would be wrong.) In this case both graphs show the same distribution; just note that the y-axis scales differ. You can download both files and run QC on them before deciding. Alternatively, read the paper (or whatever source the data came from) to find out why they ran the same sample twice, whether they changed anything for the second run (library prep, etc.), and what their results were in terms of which run is better.
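Since the two plots use different y-axis scales, one way to compare them fairly is to recompute the length distributions yourself as fractions of reads rather than raw counts. The sketch below assumes the reads are available as FASTA; `fasta_lengths`, `length_profile` and the 50 bp bin width are illustrative choices, not anything taken from the original data source.

```python
import io
from collections import Counter
from statistics import quantiles

def fasta_lengths(handle):
    """Return the sequence length of every record in a FASTA stream."""
    lengths, current = [], None
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            if current is not None:
                lengths.append(current)
            current = 0
        elif current is not None:
            current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths

def length_profile(lengths, bin_width=50):
    """Summarize lengths: median, interquartile range, binned read fractions.

    Needs at least two reads, because statistics.quantiles requires that.
    """
    q1, q2, q3 = quantiles(lengths, n=4)
    counts = Counter((n // bin_width) * bin_width for n in lengths)
    fractions = {start: counts[start] / len(lengths) for start in sorted(counts)}
    return {"median": q2, "iqr": q3 - q1, "fractions": fractions}

# Toy input standing in for a real file opened with open("run1.fasta")
toy = io.StringIO(">a\nACGT\nAC\n>b\nACGTACGT\n>c\nAC\n")
print(length_profile(fasta_lengths(toy), bin_width=5))
```

The interquartile range gives a rough number for how wide the peak is, and the `fractions` histogram can be plotted for both runs on a shared axis so the shapes are directly comparable.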
Could you please explain why you consider a narrower peak better (assuming everything else is the same)? Thanks.
A 454 machine runs with just enough reagents to sequence a given length, so if everything else is the same (and you should carefully double-check this), I would expect the broader peak to come from slight problems in the 454 reaction process. This is why I would specifically check whether the 454 machine settings were exactly the same.
Thanks for the explanation.