Library size normalization during CNV calling from genomically doubled tumor tissue
5.3 years ago
CY ▴ 750

I realized library size may be an issue and most CNV tools seem to ignore this.

Say we try to call CNVs from a tumor tissue whose genome is almost doubled. If we directly compare the depth of each bin between tumor and normal control, and the library sizes (fastq sizes) of tumor and normal tissue are equal, the genome-doubled tumor tissue will still show the same depth as the normal tissue. If a CNV caller ignores this library size issue, CNVs called from genome-doubled tissue will be incorrect (the estimated diploid baseline is actually a doubled genome). Can anyone share some insight on this? Thanks

CNV • 1.8k views

The library preparation and quantification that determine the library size are normally done based on the amount of DNA, not the number of cells.

Exactly. If both tumor and normal tissue require the same DNA amount during library prep and output roughly the same fastq size, the DNA from the genome-doubled tumor tissue is effectively "diluted" by requiring the same DNA amount. The genome-doubled regions of the tumor tissue will have the same depth as in normal tissue and no CNV will be called. Am I right?
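
To make the arithmetic concrete, here is a minimal sketch, assuming equal-sized bins, pure samples, and reads distributed in proportion to each bin's share of the total DNA. The function names and the toy copy-number profiles are made up for illustration and are not taken from any CNV caller.

```python
import math

def expected_depth(copy_numbers, total_reads):
    """Expected reads per bin: the bin's share of all DNA times the library size."""
    total_copies = sum(copy_numbers)
    return [total_reads * cn / total_copies for cn in copy_numbers]

def log2_ratios(tumor_cn, normal_cn, total_reads=1_000_000):
    """Tumor/normal log2 depth ratio per bin at equal library size."""
    tumor = expected_depth(tumor_cn, total_reads)
    normal = expected_depth(normal_cn, total_reads)
    return [math.log2(t / n) for t, n in zip(tumor, normal)]

normal_cn = [2, 2, 2, 2]          # diploid control, four toy bins
doubled_cn = [4, 4, 4, 4]         # perfectly doubled tumor
doubled_with_loss = [4, 4, 4, 3]  # doubled tumor that lost one copy in the last bin

flat = log2_ratios(doubled_cn, normal_cn)
shifted = log2_ratios(doubled_with_loss, normal_cn)
print(flat)     # every bin has log2 ratio 0: the doubling is invisible
print(shifted)  # the lost copy shows up as a negative log2 ratio in the last bin
```

So depth alone cannot reveal a perfect doubling, but any segment that deviates from the doubled state still produces a relative signal.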

A perfectly doubled genome couldn't be distinguished from normal, but chromosomal instability by definition results in many gains and losses. In simple terms, algorithms designed for this problem see that the baseline cannot be a copy number of 2 when there are extensive losses that would then correspond to copy numbers 1, 0, -1, -2, since negative copy numbers are impossible.
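
A toy sketch of that reasoning, assuming the observed depth levels are evenly spaced and each step corresponds to exactly one copy. The helper names and the level values are hypothetical, and this is not how ASCAT or ABSOLUTE are implemented; it only shows why a baseline of 2 gets rejected.

```python
def assign_copy_numbers(levels, baseline_level, baseline_cn):
    """Give each depth level an integer copy number, one copy per step from the baseline."""
    ordered = sorted(levels)
    base_idx = ordered.index(baseline_level)
    return {lvl: baseline_cn + (i - base_idx) for i, lvl in enumerate(ordered)}

def is_feasible(assignment):
    """A copy-number assignment is only possible if no level is negative."""
    return all(cn >= 0 for cn in assignment.values())

# Hypothetical relative depth levels: the baseline at 1.0 plus four loss levels below it.
levels = [0.2, 0.4, 0.6, 0.8, 1.0]

diploid = assign_copy_numbers(levels, baseline_level=1.0, baseline_cn=2)
tetraploid = assign_copy_numbers(levels, baseline_level=1.0, baseline_cn=4)

print(diploid)     # loss levels map to 1, 0, -1, -2
print(tetraploid)  # loss levels map to 3, 2, 1, 0
print(is_feasible(diploid), is_feasible(tetraploid))
```

With four distinct loss levels below the baseline, a baseline of 2 forces negative copy numbers, while a baseline of 4 gives a valid assignment.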

I see what you mean, but I think what CY means by library size is simply total sequencing read coverage. But maybe I was interpreting too aggressively.

Exactly, by library size I mean the total sequencing depth or fastq size.

5.3 years ago

Every purity- and ploidy-aware copy number caller takes this into account. Have a look at the ASCAT or ABSOLUTE papers.

Yes, they estimate ploidy, but it is based on allele frequencies, not library size.

? Allele frequencies are used in these algorithms to eliminate wrong purity and ploidy combinations.

I just wanted to clarify that library size is not used, which is what the original question was interested in. I was not disagreeing with anything that was stated.
