Hi, I wonder what I should do to obtain a dataset for drawing a plot of GC content against mean depth for exome data (like this). I have BAM files, BED files and FASTQ files, but I don't even know where to start.
Thanks
So I have done the following:
Using bedtools nuc, I calculated the GC content of each region in my BED file.
Using samtools bedcov, I calculated the average depth of coverage for each region in my BED file (BED intervals are 0-based and half-open, so the region length is end minus start):
samtools bedcov -Q 30 Intervals.bed sample.bam | sort -k1,1 -k2,2n -k3,3n | awk 'BEGIN {OFS="\t"}{a=($3-$2);b=($5/a);print $1,$2,$3,$4,b,"+","+"}' > sample_coverage.tsv
Using bedtools intersect or a simple join on the region coordinates, you then have the data to plot depth of coverage versus GC content (%).
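For the "simple join" step, something like the following might work. It is only a sketch under a few assumptions that are not in the post above: the GC file is the output of bedtools nuc run on a 4-column BED (so the GC fraction lands in column 6), the coverage file is the sample_coverage.tsv produced above (mean depth in column 5), and the names `join_gc_depth`, `reference.fa` and `Intervals_gc.tsv` are made up for illustration.

```shell
#!/bin/sh
# Join per-region GC content with per-region mean depth on (chrom, start, end).
#
# Assumed inputs (hypothetical file names):
#   Intervals_gc.tsv     e.g. from: bedtools nuc -fi reference.fa -bed Intervals.bed
#                        (4-column BED assumed, so the GC fraction is column 6)
#   sample_coverage.tsv  the samtools bedcov / awk output, mean depth in column 5
#
# Output columns: chrom, start, end, GC percent, mean depth
join_gc_depth() {
    gc_file=$1
    cov_file=$2
    awk 'BEGIN { FS = OFS = "\t" }
         /^#/ { next }                                      # skip the bedtools nuc header line
         NR == FNR { gc[$1 FS $2 FS $3] = $6 * 100; next }  # first file: store GC as a percentage
         ($1 FS $2 FS $3) in gc {                           # second file: emit regions found in both
             print $1, $2, $3, gc[$1 FS $2 FS $3], $5
         }' "$gc_file" "$cov_file"
}

# Usage (not executed here):
#   join_gc_depth Intervals_gc.tsv sample_coverage.tsv > gc_vs_depth.tsv
```

The resulting table has one row per region, so columns 4 and 5 can go straight into a scatter plot in R, gnuplot or whatever you prefer. If your BED file has a different number of columns, the GC column index in the bedtools nuc output shifts accordingly.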
See Plot Coverage Vs. Gc Content for a start.
Is CollectGcBiasMetrics (Picard) what I am looking for?