Question

Regarding sliding window for CNV analysis

0

6.9 years ago

DL ▴ 50

Hello,

I am going to analyse structural variants in genomic data. Can anyone tell me how I can divide the genome into windows of a given size and count the average number of reads in each window? I read about this and found the bam2window.pl script, but I do not understand how it works: it requires test and sample data, whereas I have a reference and 3 samples, so I do not see how to apply it.

Can someone suggest how I can do this analysis?

Thanks in advance

next-gen genome R Assembly sequence • 1.8k views

ADD COMMENT • link updated 6.9 years ago by dariober 15k • written 6.9 years ago by DL ▴ 50

score 1 · Answer 1 · 2017-12-17

1

6.9 years ago

dariober 15k

Are you sure you want to roll your own method of analysis? For CNVs, cnvkit is quite popular. Recently, I looked into FACETS with good results.

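To answer your question, have a look at the bedtools suite, in particular makewindows (windowMaker) to build the windows and coverage or multicov to count reads in them. bamCoverage, which is part of deepTools rather than bedtools, is another option for binned coverage.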
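If it helps to see the idea behind those tools, here is a minimal Python sketch of fixed-size windowing and per-window counting. It is an illustration only, not how bedtools is implemented: the function names and the toy read tuples are made up, each read is assigned to a single window by its 0-based start (overlap-based counters such as bedmap can count a read in two windows), and `min_mapq=36` mimics the `$5>35` filter used later in this thread.

```python
from collections import Counter


def make_windows(chrom_sizes, window_size):
    """Yield half-open (chrom, start, end) windows tiling each chromosome."""
    for chrom, length in chrom_sizes.items():
        for start in range(0, length, window_size):
            yield chrom, start, min(start + window_size, length)


def count_reads_per_window(reads, chrom_sizes, window_size, min_mapq=0):
    """Count reads per window.

    reads: iterable of (chrom, start, mapq) with 0-based start positions.
    Assumption: a read belongs to the window containing its start.
    """
    counts = Counter()
    for chrom, start, mapq in reads:
        if mapq >= min_mapq:
            counts[(chrom, start // window_size)] += 1
    # Emit every window, including those with zero reads.
    return [
        (chrom, start, end, counts[(chrom, start // window_size)])
        for chrom, start, end in make_windows(chrom_sizes, window_size)
    ]


# Toy data (hypothetical): one 25 kb chromosome and four reads.
chrom_sizes = {"chr1": 25000}
reads = [("chr1", 100, 60), ("chr1", 9999, 60),
         ("chr1", 10000, 20), ("chr1", 24000, 40)]

table = count_reads_per_window(reads, chrom_sizes, 10000, min_mapq=36)
mean_count = sum(row[3] for row in table) / len(table)
```

Here `table` holds one row per 10 kb window and `mean_count` is the average number of reads per window. For real BAM files the dedicated tools are far faster and handle edge cases this sketch ignores.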

ADD COMMENT • link 6.9 years ago by dariober 15k

0

Thank you so much for your reply. OK, I will look into cnvkit and FACETS. I have counted the reads in the given windows, but I do not know whether I did it correctly:

1. I divided the genome into 10 kb windows using the makewindows command.
2. bam2bed < DC_mapped.bam | awk '{if($5>35)print $0}' | bedmap --count /work/dlakhwani/new_mapping/Genome_10000.bed - > countdata.bed

Is this right?

And why do we apply GC correction to the aligned reads in a given window? Can you explain it?

ADD REPLY • link 6.9 years ago by DL ▴ 50

0

Just a comment regarding dividing your genome: I do not think it is a good idea. It is much better to use already tested tools that identify CNVs, as dariober said.

Regarding GC normalization, I refer you to A: Why GC-content normalization for CNV analysis?

In short, sequence coverage on the Illumina Genome Analyzer platform is influenced by GC content; see https://www.ncbi.nlm.nih.gov/pubmed/12915456 and http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0062856
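To make the idea concrete, here is a rough Python sketch of one simple GC-correction scheme: windows are grouped by rounded GC percentage and each count is rescaled by (overall median count) / (median count of windows with the same GC). This is an assumption about how such a correction can be done, not the exact method of cnvkit or of the papers linked above, and the function names and toy numbers are invented for illustration.

```python
from collections import defaultdict
from statistics import median


def gc_fraction(seq):
    """Fraction of G/C among the A, C, G, T bases of a window sequence."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    acgt = gc + seq.count("A") + seq.count("T")
    return gc / acgt if acgt else None  # None for all-N windows


def gc_correct(counts, gc_values):
    """Rescale window counts so every GC bin shares the overall median."""
    overall = median(counts)
    by_bin = defaultdict(list)
    for count, gc in zip(counts, gc_values):
        by_bin[round(gc * 100)].append(count)  # 1% GC bins
    bin_median = {b: median(v) for b, v in by_bin.items()}
    corrected = []
    for count, gc in zip(counts, gc_values):
        m = bin_median[round(gc * 100)]
        # Bins whose median is zero cannot be rescaled.
        corrected.append(count * overall / m if m > 0 else float("nan"))
    return corrected


# Toy example (hypothetical): high-GC windows are under-covered by half.
counts = [100, 100, 50, 50]
gc_values = [0.40, 0.40, 0.60, 0.60]
corrected = gc_correct(counts, gc_values)
```

After correction all four toy windows get the same value, so the apparent two-fold drop in the high-GC windows is no longer mistaken for a copy-number loss.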

ADD REPLY • link 6.9 years ago by Medhat 9.8k

0

Thank you so much for the useful advice. I am going through the cnvkit tool, but I am confused because I do not have experiment and control samples. cnvkit uses normal and tumor samples, but I have 3 WGS datasets and a reference sequence, so how can I use cnvkit? Please advise.

ADD REPLY • link 6.9 years ago by DL ▴ 50