Hello,
I am going to analyse structural variants in genomic data. Can anyone tell me how to divide the genome into windows of a given size, and how to count the average number of reads in each window? I read about this and found the bam2window.pl script, but I do not know how it works: it requires test and sample data, whereas I have a reference and data for 3 samples, so I do not understand how to apply it.
Can someone suggest how I can do this analysis?
Thanks in advance
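
To make the question concrete, here is a minimal sketch of the windowing and counting being asked about. It is only an illustration under stated assumptions, not how bam2window.pl works: the names `make_windows` and `count_reads` are hypothetical, the chromosome length and read positions are toy values, and in practice the read start positions would come from a BAM file rather than a hard-coded list. Each read is assigned to a window by its 0-based start position.

```python
from collections import defaultdict


def make_windows(chrom_sizes, window_size):
    """Yield (chrom, start, end) half-open, 0-based windows over each chromosome."""
    for chrom, length in chrom_sizes.items():
        for start in range(0, length, window_size):
            # The last window is truncated at the chromosome end.
            yield chrom, start, min(start + window_size, length)


def count_reads(read_starts, window_size):
    """Count reads per window, keyed by (chrom, window_start)."""
    counts = defaultdict(int)
    for chrom, pos in read_starts:
        counts[(chrom, pos // window_size * window_size)] += 1
    return counts


# Toy input (assumed values, for illustration only).
chrom_sizes = {"chr1": 2500}
read_starts = [("chr1", 10), ("chr1", 990), ("chr1", 1000), ("chr1", 2400)]
window_size = 1000

counts = count_reads(read_starts, window_size)
table = [
    (chrom, start, end, counts.get((chrom, start), 0))
    for chrom, start, end in make_windows(chrom_sizes, window_size)
]
mean_reads = sum(row[3] for row in table) / len(table)

for row in table:
    print(*row, sep="\t")
print("mean reads per window:", mean_reads)
```

Windows with no reads are kept with a count of 0, so the mean is taken over all windows rather than only the covered ones.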
Thank you so much for your reply. OK, I will look into CNVkit and FACETS. I have counted the reads in a given window, but I do not know whether I did it correctly:
Is this right?
And why do we apply GC correction to the aligned reads in a given window? Can you explain it?
Just a comment regarding dividing your genome: I do not think it is a good idea. It is much better to use already tested tools that identify CNVs, as dariober said.
Regarding GC normalization, I refer you to A: Why GC-content normalization for CNV analysis?
Sequence coverage on the Illumina Genome Analyzer platform is influenced by GC content; see https://www.ncbi.nlm.nih.gov/pubmed/12915456 and http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0062856
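
To illustrate what such a correction does, here is a rough sketch of one common scheme, median scaling per GC bin. It is an assumption for illustration, not necessarily the method used in the linked references or in any particular CNV tool. The function name `gc_correct`, the 5% bin width, and the toy window values are all hypothetical. Each window's count is multiplied by the overall median count divided by the median count of windows with similar GC content.

```python
from collections import defaultdict
from statistics import median


def gc_correct(windows, bin_width_pct=5):
    """Scale read counts so that every GC bin has the same median count.

    windows: list of (gc_fraction, read_count) pairs, one per window.
    Returns the corrected counts in the same order.
    """
    def gc_bin(gc):
        # Bin on integer GC percent to avoid floating-point edge effects.
        return int(round(gc * 100)) // bin_width_pct

    by_bin = defaultdict(list)
    for gc, count in windows:
        by_bin[gc_bin(gc)].append(count)

    overall = median(count for _, count in windows)
    bin_median = {b: median(v) for b, v in by_bin.items()}

    corrected = []
    for gc, count in windows:
        m = bin_median[gc_bin(gc)]
        # Bins whose median is 0 are left at 0 to avoid dividing by zero.
        corrected.append(count * overall / m if m > 0 else 0.0)
    return corrected


# Toy windows (assumed values): low-GC windows are under-covered here.
windows = [(0.32, 80), (0.33, 120), (0.52, 180), (0.53, 220)]
corrected = gc_correct(windows)
print(corrected)
```

In this toy case the low-GC windows are scaled up and the high-GC windows are scaled down, so a coverage difference that only reflects GC content is not mistaken for a copy number change.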
Thank you so much for the useful advice. I am going through the CNVkit tool, but I am confused because I do not have experiment and control samples. CNVkit uses normal and tumor samples, but I have 3 WGS datasets and a reference sequence, so how can I use CNVkit? Please advise.