Hello, I have gone through some papers, but I still don't understand why GC content estimation and correction are important for CNV analysis. Can anyone explain it to me?
Thanks & Regards
Imagine that, due to GC bias, you are twice as likely to sample sequencing reads from a mid-GC region of the genome as from a high-GC region. That is, all other things being equal (e.g. copy number), you will obtain twice as many sequencing reads from mid-GC regions as from high-GC regions. This might occur due to, e.g., sequence-dependent differences in amplification during PCR, but there are a number of steps where extreme GC content might lead to a lower _a priori_ probability of generating a read. If not accounted for, this effect will confound your analysis of copy number, because the sequencing data represent a combination of the true "biological signal" (the copy numbers) and the technical biases (in this case, the GC content of the sequence). For example, a high-GC region whose copy number has doubled would show roughly the same read depth as an unamplified mid-GC region. While it is not generally possible to remove all technical effects perfectly, correcting for effects such as GC content, which are known to be significant in some cases, can help tease apart the true biological signal from the technical artifacts.
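To make the confounding concrete, here is a minimal sketch of one simple correction, assuming read counts have already been tallied per fixed-size window along with each window's GC fraction. The function name `gc_correct`, the `bin_width` parameter, and the toy numbers are all hypothetical, chosen only to mirror the "twice as many reads from mid-GC regions" scenario. It rescales each window by the median count of windows with similar GC content; real CNV tools often fit a smoother curve (such as LOESS) instead, so treat this as an illustration rather than any particular tool's method.

```python
from collections import defaultdict
from statistics import median


def gc_correct(counts, gc_fractions, bin_width=0.05):
    """Scale each window's count by (global median / median of its GC stratum)."""
    # Group windows into strata of similar GC content.
    strata = defaultdict(list)
    for count, gc in zip(counts, gc_fractions):
        strata[round(gc / bin_width)].append(count)

    global_median = median(counts)
    stratum_median = {key: median(vals) for key, vals in strata.items()}

    corrected = []
    for count, gc in zip(counts, gc_fractions):
        m = stratum_median[round(gc / bin_width)]
        # A stratum with zero median coverage cannot be rescaled.
        corrected.append(count * global_median / m if m > 0 else float("nan"))
    return corrected


# Toy data (made up): four mid-GC windows, then four high-GC windows that
# yield about half as many reads. The last window in each group is duplicated.
counts = [100, 102, 98, 200, 50, 51, 49, 100]
gc = [0.45, 0.45, 0.45, 0.45, 0.70, 0.70, 0.70, 0.70]

corrected = gc_correct(counts, gc)
for raw, fixed in zip(counts, corrected):
    print(f"raw={raw:4d}  corrected={fixed:7.2f}")
```

In the raw counts, the duplicated high-GC window (100 reads) is indistinguishable from a normal mid-GC window (100 reads). After correction, the two duplicated windows end up with the same value, about twice that of their neighbours, regardless of GC content.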