Hi all,
I have depth of coverage (read counts at each base) from 8 samples, and I would like to compare them to find regions that differ in depth (CNVs, deletions, etc.). I have already normalized the samples and now want to apply some method to compare them and identify significant differences. I am using a Perl script. The problem is that I am new to statistical analysis and need a little guidance. I know some software, such as CNVseq, can do the same job, but it uses statistics that I do not understand :( What are your suggestions? What is the simplest way to compare these samples? Could fold change be applied? I applied z-normalization; is that applicable here? Any literature, scripts, or suggestions would help. (Please note that I am new to statistical analysis.)
EDIT: Please note that the data come from whole-genome resequencing, not RNA-seq. The samples represent different dog populations.
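To make the fold-change idea concrete, here is a minimal sketch in Python (not the Perl script mentioned above, and not the CNVseq method). It assumes each sample is a plain list of per-base depths over the same coordinates, averages depth in fixed non-overlapping windows, scales each sample by its own median window depth, and reports the log2 ratio per window. The function names, the 1000-base window, and the small pseudocount are all illustrative assumptions.

```python
import math
from statistics import median

def window_means(depth, size):
    """Mean depth in consecutive non-overlapping windows.
    A trailing partial window is dropped."""
    return [sum(depth[i:i + size]) / size
            for i in range(0, len(depth) - size + 1, size)]

def log2_ratios(depth_a, depth_b, size=1000, pseudo=0.01):
    """Per-window log2 fold change of sample A over sample B.

    Each sample is first divided by its own median window depth so that
    differences in overall sequencing depth cancel out. Assumes both
    medians are non-zero. The pseudocount (an arbitrary choice) avoids
    taking the log of zero in windows with no coverage.
    """
    wa = window_means(depth_a, size)
    wb = window_means(depth_b, size)
    ma, mb = median(wa), median(wb)
    return [math.log2((a / ma + pseudo) / (b / mb + pseudo))
            for a, b in zip(wa, wb)]

# Toy data: sample B has half the depth of sample A in the third window.
sample_a = [30] * 4000
sample_b = [30] * 2000 + [15] * 1000 + [30] * 1000
ratios = log2_ratios(sample_a, sample_b)
```

On this toy input the third window comes out close to +1 (B has about half the depth of A there) and the other windows are 0. A fold change alone says nothing about significance; that is the part the statistics in tools like CNVseq address.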
I would suggest learning the CNVseq algorithm. Even if you want to write your own version, you need to understand the existing work. This is science.
Thanks lh3, I will study CNVseq to see how it works, but it might not fit my current question, as I already have information on deletions and inversions from split-read and paired-end analysis. My interest is in looking at the depth in these regions by comparing the samples mentioned above.
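For regions that are already known (for example, a deletion called from split reads), one rough way to compare depth across the samples is sketched below in Python. It assumes per-base depth lists indexed from 0, half-open region coordinates, and made-up sample names; each sample's mean depth in the region is divided by that sample's median depth, and the resulting values are z-scored across samples. With only 8 samples the z-score is a crude indicator, not a formal test.

```python
from statistics import mean, median, stdev

def region_depth(depth, start, end):
    """Mean depth over the half-open interval [start, end)."""
    return mean(depth[start:end])

def region_zscores(samples, start, end):
    """Z-score of normalized region depth for each sample.

    samples maps a sample name to its per-base depth list. Region depth is
    divided by the sample's overall median depth (assumed non-zero) before
    samples are compared with each other.
    """
    norm = {name: region_depth(d, start, end) / median(d)
            for name, d in samples.items()}
    values = list(norm.values())
    mu, sd = mean(values), stdev(values)
    if sd == 0:
        # All samples look identical in this region.
        return {name: 0.0 for name in norm}
    return {name: (v - mu) / sd for name, v in norm.items()}

# Toy data: 8 samples, the last one with half depth in bases 100-199.
samples = {f"dog{i}": [30] * 1000 for i in range(1, 8)}
samples["dog8"] = [30] * 100 + [15] * 100 + [30] * 800
z = region_zscores(samples, 100, 200)
```

Here `dog8` gets a strongly negative z-score (below -2) while the other seven sit slightly above 0. A sample that stands out like this in a region where a deletion was already called is at least consistent with that call.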
I have only looked at the homepage and the first manual page of CNVseq. I could be wrong, but... doesn't CNVseq do exactly what you want: comparing the read depth like what you do with arrayCGH?
I completely agree with you. I think I misinterpreted the information because of my lack of knowledge.
Could you elaborate on the data that was sequenced? Is it genome-seq? I hope it's not RNA-seq.
It's genomic resequencing data from Illumina paired-end sequencing.
Are these tumor samples? Do you have matched controls for these samples? Is this DNA-seq or RNA-seq? If it is DNA-seq, is it capture or whole genome, shallow or deep sequencing?
Sorry for my poor explanation of the data. It's DNA sequence data.