I run ClinCNV ( https://github.com/imgag/ClinCNV ) to detect copy number in a WGS BAM file. I need to get two files, a BED file and a COV file.
I have several BED files: hg38.all.bed, hg38.bed and preparedBedHg38.bin100.bed. Which of them should be used to make the COV file with BedCoverage (BedCoverage -bam $bamPath -in $bedPath -min_mapq 3 -out $sampleName".cov")?
BedCoverage extracts the average coverage for the input regions from one or several BAM/CRAM file(s).
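For a cohort, the same call can be repeated once per sample against one shared BED file, so that every .cov file has the same regions. Below is a minimal Python sketch, assuming BedCoverage is on the PATH. It uses only the flags quoted above; the function names and any file paths passed in are hypothetical.

```python
import subprocess
from pathlib import Path


def bedcoverage_command(bam_path, bed_path, out_dir):
    """Build the BedCoverage call quoted above for one BAM file."""
    sample_name = Path(bam_path).stem  # "sampleA" for "sampleA.bam"
    out_path = Path(out_dir) / f"{sample_name}.cov"
    return [
        "BedCoverage",
        "-bam", str(bam_path),
        "-in", str(bed_path),
        "-min_mapq", "3",
        "-out", str(out_path),
    ]


def run_all(bam_paths, bed_path, out_dir):
    """Run BedCoverage on every BAM file with the same BED file."""
    for bam_path in bam_paths:
        # check=True stops at the first sample for which the tool fails
        subprocess.run(bedcoverage_command(bam_path, bed_path, out_dir), check=True)
```

Building the command as a list (rather than one shell string) avoids quoting problems with sample names that contain spaces.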
Thank you for developing this tool for CNV analysis. I tried to use it in our lab on 40 samples, but I keep getting this error in the final step when running [ Rscript clinCNV.R ......]
The error is below:
[1] "We run script located in folder /home/bioinformatics/ClinCNV . All the paths will be calculated realtive to this one. If everything crashes, please, check the correctness of this path first."
[1] "START cluster allocation."
[1] "Cluster allocated."
[1] "END cluster allocation."
[1] "We are started with reading the coverage files and bed files 2022-03-23 10:57:53"
[1] "ERROR: your file with normal coverages have different amount of rows with bed file or coordinates are not matching. It is most probably a technical mistake. Check the input. List of regions not presented:"
chr.X start end gc genes
1 chr1 65509 65625 0.33 0
chr.X start end gc genes
2 chr1 65831 65973 0.42 0
chr.X start end gc genes
3 chr1 69481 69600 0.51 0
......
......
......
I notice that in your sample data the bed file and the .cov file have the same ranges, but in my case they do not.
Should I edit my bed file and remove the extra ranges that are not included in my final coverage file, or should I edit the coverage file?
Your help with this would be appreciated.
PS: I'm using ClinCNV for WES germline CNV detection.
How can I generate a .cov file that contains the extra regions, the same as in my .bed file? If I understand correctly, I should add all missing regions to my merged .cov file and put a '0' value for their coverage?
After running [Rscript clinCNV.R ......] I get the error below:
"ERROR: your file with normal coverages have different amount of rows with bed file or coordinates are not matching. It is most probably a technical mistake. Check the input. List of regions not presented:"
chr.X start end gc genes
chr1 721430 721906 0.4 0
But when I checked my bed file and the merged coverage file:
It is a different region. What I normally do is take the .bed file and calculate the .cov files from that same .bed file. Then the regions match between .cov and .bed. I would do this if I were you :)
Hi, if you calculate coverage using some BED file with regions A,B,C, it should give you coverage in regions A,B,C. How could it happen that for some regions the coverage is missing?
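To see exactly which regions differ before recalculating anything, the coordinates of the two files can be compared directly. This is a rough Python sketch, not part of ClinCNV. It assumes both files are tab-separated with chr, start and end in the first three columns, and that any header line in the .cov file starts with '#'. The function names are hypothetical.

```python
def parse_regions(lines):
    """Return (chr, start, end) tuples from the first three tab-separated columns."""
    regions = []
    for line in lines:
        line = line.rstrip("\n")
        # Skip blank lines and header/comment lines (assumed to start with '#')
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        regions.append((fields[0], int(fields[1]), int(fields[2])))
    return regions


def missing_regions(bed_lines, cov_lines):
    """List BED regions that have no row with identical coordinates in the coverage file."""
    covered = set(parse_regions(cov_lines))
    return [region for region in parse_regions(bed_lines) if region not in covered]
```

Both functions accept any iterable of lines, so two open file handles can be passed in directly, for example `missing_regions(open(bed_path), open(cov_path))`. An empty result means every BED region has a matching coverage row.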
Hi Ishak, you should segment your reference genome into pieces of more or less uniform length and give this bed file as input to BedCoverage.
The bed file that you need should contain 3 columns, separated by tabs: chr, start, end.
The bed file that is required for the further steps should also contain GC annotation as the 4th column and, optionally, genes as the 5th.
I'd recommend using windows of around 1,000 bp. 100 bp windows may take too much space on your server.
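The segmentation step described above could be sketched in Python as follows. This is only an illustration and not necessarily how the ClinCNV authors prepare their own BED files. The chromosome lengths are assumed to be supplied by the user (for instance from a FASTA index), the function names are hypothetical, and the GC column still has to be added separately.

```python
def make_windows(chrom_lengths, window_size=1000):
    """Yield (chr, start, end) windows; the last window of a chromosome may be shorter."""
    for chrom, length in chrom_lengths.items():
        for start in range(0, length, window_size):
            # BED coordinates are 0-based with an exclusive end
            yield chrom, start, min(start + window_size, length)


def write_bed(windows, out_path):
    """Write the windows as a 3-column, tab-separated BED file."""
    with open(out_path, "w") as handle:
        for chrom, start, end in windows:
            handle.write(f"{chrom}\t{start}\t{end}\n")
```

With the same chromosome lengths, a window size of 100 produces about ten times as many rows as a window size of 1000, which is where the extra disk usage comes from.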
Thanks a lot for your reply