What's the difference between library diversity and unique molecules? Can I use unique molecules instead of library diversity?
If the question is about sequence, the two terms can be used interchangeably. If two library fragments have identical sequence at both ends, they can be treated as identical, although they may still differ internally, since inserts of 400-500 bp are unlikely to be sequenced end to end.
Most library prep methods include an amplification step. Unless you incorporate unique molecular identifiers (UMIs), it is difficult to tell whether fragments with the same sequence were distinct molecules to begin with. If the library was over-amplified, fewer of the reads you observe will represent unique molecules, judged by observed sequence.
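To make the "identical from both ends" idea concrete, here is a minimal sketch of counting unique fragments by observed sequence. It assumes read pairs are already loaded as (read 1, read 2) tuples; the function name `duplication_summary` and the toy reads are made up for illustration. Without UMIs, a count like this cannot tell PCR copies apart from independent molecules that happen to share the same ends.

```python
from collections import Counter

def duplication_summary(fragments):
    """Summarise duplication for an iterable of (read1, read2) sequence tuples.

    Two fragments are treated as the same molecule when both ends match
    exactly. This is an assumption: the unsequenced middle could still differ.
    """
    counts = Counter(fragments)
    total = sum(counts.values())
    unique = len(counts)
    return {
        "total": total,
        "unique": unique,
        "duplicate_fraction": 1 - unique / total if total else 0.0,
    }

# Toy read pairs, invented for this example
pairs = [
    ("ACGTAC", "TTGCAA"),
    ("ACGTAC", "TTGCAA"),  # same ends as the first pair: counted as a duplicate
    ("ACGTAC", "GGGCAA"),  # same read 1 but different read 2: counted as distinct
    ("CCGTAC", "TTGCAA"),
]
summary = duplication_summary(pairs)
print(summary)
```

Keying on both reads of a pair rather than read 1 alone is the stricter choice, and it matches the "identical from both ends" criterion above.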
Probably not. Library diversity is generally used to describe the per-base nucleotide diversity. More unique molecules will usually increase that diversity, but not necessarily.
Imagine you are sequencing a PCR amplicon where the first 15 nt are the same for every molecule and the rest is completely random:
GCTGGAGTCCTGAGGAATTAAATCATTATAC
GCTGGAGTCCTGAGGTCGTTCGCGTAACGC
GCTGGAGTCCTGAGGGGAGCAAACCTCAGT
So imagine you have one million unique molecules, all with the same 15 nt pattern at the start (as shown above).
Your library diversity (some say sequence diversity) is low because the sequence is identical at the start, and this can cause very serious problems for certain types of sequencing (e.g. Illumina). In this sense, diversity is generally measured on a per-base level.
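One way to put a number on per-base diversity is the Shannon entropy of the base composition at each position. The sketch below is only an illustration using the three example reads from this post. The function name `per_base_entropy` is made up, and entropy is my choice of metric here, not necessarily what any particular tool reports.

```python
import math
from collections import Counter

def per_base_entropy(reads):
    """Return the Shannon entropy (in bits) of the bases at each position.

    Reads are compared only up to the length of the shortest read.
    0 bits means every read has the same base at that position.
    """
    length = min(len(r) for r in reads)
    entropies = []
    for i in range(length):
        counts = Counter(r[i] for r in reads)
        total = sum(counts.values())
        h = sum((c / total) * math.log2(total / c) for c in counts.values())
        entropies.append(h)
    return entropies

# The three amplicon reads from the example above
reads = [
    "GCTGGAGTCCTGAGGAATTAAATCATTATAC",
    "GCTGGAGTCCTGAGGTCGTTCGCGTAACGC",
    "GCTGGAGTCCTGAGGGGAGCAAACCTCAGT",
]
entropy = per_base_entropy(reads)
for position, h in enumerate(entropy, start=1):
    print(position, round(h, 3))
```

The first 15 positions come out at zero even though all three molecules are unique, which is exactly the low-diversity situation described above.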
I have two different libraries that were sequenced. How should I compare their library diversity?
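FastQC is the simplest option: run it on both libraries and compare the reports, in particular the "Per base sequence content" and "Sequence Duplication Levels" modules.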
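If you want to look at the numbers yourself, here is a rough sketch of a side-by-side comparison. It assumes plain four-line FASTQ records (no wrapped sequence lines), and it uses tiny in-memory inputs in place of real files. The helper names `read_fastq_sequences` and `max_base_fraction` are made up, and this is not how FastQC computes its plots.

```python
import io
from collections import Counter

def read_fastq_sequences(handle):
    """Yield the sequence line of each record, assuming four lines per record."""
    for i, line in enumerate(handle):
        if i % 4 == 1:
            yield line.strip()

def max_base_fraction(seqs, n_positions):
    """For each of the first n_positions, return the share of reads that carry
    the most common base. Values near 1.0 indicate low per-base diversity."""
    columns = [Counter() for _ in range(n_positions)]
    for s in seqs:
        for i, base in enumerate(s[:n_positions]):
            columns[i][base] += 1
    return [
        max(c.values()) / sum(c.values()) if c else float("nan")
        for c in columns
    ]

# Toy libraries, invented for this example. For real data, replace with
# open(path) or gzip.open(path, "rt").
lib_a = io.StringIO("@r1\nACGT\n+\nIIII\n@r2\nACCA\n+\nIIII\n")
lib_b = io.StringIO("@r1\nACGT\n+\nIIII\n@r2\nTGCA\n+\nIIII\n")

frac_a = max_base_fraction(read_fastq_sequences(lib_a), 4)
frac_b = max_base_fraction(read_fastq_sequences(lib_b), 4)
for pos, (a, b) in enumerate(zip(frac_a, frac_b), start=1):
    print(pos, a, b)
```

In this toy case library A is dominated by a single base at the first two positions while library B is not, so B is the more diverse of the two at the start of the read.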