How to draw a heatmap from CNV seg data called by GATK?
5.5 years ago
MatthewP ★ 1.4k

Hello, everyone. I followed this tutorial to call CNVs for my WES data. After running ModelSegments we get seg data, which can be plotted with PlotModeledSegments; this gives a CNV plot for each sample. I have many samples (100+) and want to generate a heatmap to see the differences and commonalities between samples.
I searched a lot and found that the R package copynumber may be useful. However, copynumber needs data like:

chrom arm start.pos end.pos n.probes X01.B1 X01.B2 X01.B3
1 1 p 1082138 64194749 70 -0.0455 -0.0336 -0.0376
2 1 p 65355304 119515493 58 0.0450 0.0251 -0.0272
3 1 q 142174575 146617392 8 0.0120 0.0495 -0.0317
4 1 q 146756663 245340016 129 0.4038 0.0263 -0.0091
5 2 p 314759 89830600 107 0.0026 0.0004 0.0175
6 2 q 94941109 242568229 159 0.0063 0.0111 0.0061

This means that all samples must share the same segments (e.g. chr1:1082138-64194749). In ModelSegments output, however, different samples have different segments. I will paste a few lines for 2 samples:

# sample 1
CONTIG  START   END     NUM_POINTS_COPY_RATIO   MEAN_LOG2_COPY_RATIO
chrM    3026    16198   6       1.416845
chr1    68790   55353077        6006    -0.039258
chr1    55446392        150248534       4058    0.089874
chr1    150248631       152552718       472     -0.129682
chr1    152572906       152573795       1       -25.813421
chr1    152595000       156811387       1004    -0.087822
chr1    156811388       249212846       6571    0.104009

# sample 2
CONTIG  START   END     NUM_POINTS_COPY_RATIO   MEAN_LOG2_COPY_RATIO
chrM    3026    16198   6       1.511008
chr1    68790   47800088        5398    -0.043764
chr1    47823618        152552718       5138    0.031720
chr1    152572906       152595889       2       -13.170018
chr1    152636274       158585499       1223    -0.057954
chr1    158586998       249212846       6351    0.046857

Does anyone have an idea how to modify my data to fit copynumber? Is there any other way to draw such a heatmap in R? Thanks very much.
PS: the cnvkit pipeline can generate a heatmap, but I want to compare these two pipelines.

cnv gatk heatmap • 3.1k views

I think you have to write some code to manually convert between these two formats...
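
As a starting point for such a conversion, here is a minimal sketch of reading one segment table like those pasted in the question. It is written in Python rather than R, and the function name `read_seg` is made up for illustration. It assumes whitespace-separated columns with the header names shown above, and it skips lines starting with `@` or `#`, on the assumption that the real files may carry header or comment lines before the column header.

```python
import io

def read_seg(handle):
    """Parse a ModelSegments-style table into a list of
    (contig, start, end, mean_log2_copy_ratio) tuples."""
    segments = []
    columns = None
    for line in handle:
        line = line.rstrip("\n")
        # Skip blank lines and header/comment lines (assumed '@' or '#' prefix).
        if not line.strip() or line.startswith(("@", "#")):
            continue
        fields = line.split()
        if columns is None:
            columns = fields  # first remaining line is the column header
            continue
        row = dict(zip(columns, fields))
        segments.append((
            row["CONTIG"],
            int(row["START"]),
            int(row["END"]),
            float(row["MEAN_LOG2_COPY_RATIO"]),
        ))
    return segments

# A few lines of sample 1 from the question, used as in-memory test input.
example = io.StringIO(
    "# sample 1\n"
    "CONTIG  START   END     NUM_POINTS_COPY_RATIO   MEAN_LOG2_COPY_RATIO\n"
    "chrM    3026    16198   6       1.416845\n"
    "chr1    68790   55353077        6006    -0.039258\n"
    "chr1    55446392        150248534       4058    0.089874\n"
)
segs = read_seg(example)
print(segs[1])
```

For real files, `read_seg(open(path))` would be called once per sample and the results collected in a dict keyed by sample name.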

5.5 years ago

You can ignore copynumber. Here is what you should do:

  1. Create a list of 'consensus regions', i.e. a unique list of all regions identified across all samples.
  2. Use the GenomicRanges package to determine the copy number in each 'consensus region' for each sample.

The heatmap will then be generated from the output of step 2.
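
One possible reading of these two steps, sketched in plain Python rather than with GenomicRanges: every segment start and end across all samples is treated as a breakpoint, the contig is cut at each breakpoint into disjoint regions, and each sample's log2 copy ratio is looked up per region. The function name `consensus_matrix` is hypothetical, coordinates are assumed to be 1-based inclusive, and segments are assumed not to overlap within a sample. The input is the first two chr1 segments of each sample from the question.

```python
from bisect import bisect_right
from collections import defaultdict

def consensus_matrix(samples):
    """samples: dict of sample name -> list of (contig, start, end, value).
    Returns (names, regions, matrix), where matrix[i][j] is the value of
    sample names[j] in regions[i], or None if no segment covers it."""
    names = list(samples)
    # Step 1: every segment start and end+1 is a breakpoint on its contig.
    breaks = defaultdict(set)
    for segs in samples.values():
        for contig, start, end, _ in segs:
            breaks[contig].add(start)
            breaks[contig].add(end + 1)

    regions, matrix = [], []
    for contig, points in breaks.items():
        points = sorted(points)
        # Sorted segments and their start positions, per sample, on this contig.
        per_sample = []
        for name in names:
            segs = sorted((s, e, v) for c, s, e, v in samples[name] if c == contig)
            per_sample.append(([s for s, _, _ in segs], segs))
        # Step 2: look up each sample's value in each disjoint region.
        for lo, nxt in zip(points, points[1:]):
            row = []
            for starts, segs in per_sample:
                i = bisect_right(starts, lo) - 1  # last segment starting <= lo
                covered = i >= 0 and segs[i][1] >= lo
                row.append(segs[i][2] if covered else None)
            if any(v is not None for v in row):  # drop gaps present in all samples
                regions.append((contig, lo, nxt - 1))
                matrix.append(row)
    return names, regions, matrix

samples = {
    "sample1": [("chr1", 68790, 55353077, -0.039258),
                ("chr1", 55446392, 150248534, 0.089874)],
    "sample2": [("chr1", 68790, 47800088, -0.043764),
                ("chr1", 47823618, 152552718, 0.031720)],
}
names, regions, matrix = consensus_matrix(samples)
for region, row in zip(regions, matrix):
    print(region, row)
```

The resulting regions-by-samples matrix can be written out as a table and passed to whichever heatmap function is preferred. Regions that a sample does not cover are left as `None`, and extreme values from single-point segments (such as the -25.813421 in sample 1) may need clipping before plotting.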

Kevin
