Hi, I have a question. I have a Drosophila genome spike-in in a ChIP-seq experiment. Does anybody have experience with the best way to normalise and process those samples to generate bigWig (bw) files and a count matrix? (I am using deepTools.)
I am following the vendor's instructions:
1. Map reads to the sample genome (human, mouse, etc.).
2. Map reads to the Drosophila (DM) genome.
3. Count the unique DM sequence tags.
4. Identify the sample with the fewest tags.
5. Count the DM tags from the other samples.
6. Create a normalization factor = lowest_sample / sample_of_interest.
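As a minimal sketch of the last step, the factor can be computed with awk from a two-column, tab-separated table of sample names and unique DM tag counts. The function name `spike_factors`, the table layout, and the counts below are assumptions made for illustration; they are not part of the vendor's protocol.

```shell
# Print "sample<TAB>factor" for each line of a two-column, tab-separated table:
# column 1 = sample name, column 2 = unique Drosophila (DM) tag count.
# The file is read twice: once to find the lowest count, once to divide.
spike_factors() {
  awk -F'\t' '
    # Pass 1: track the lowest DM tag count across all samples.
    NR == FNR { c = $2 + 0; if (FNR == 1 || c < min) min = c; next }
    # Pass 2: factor = lowest_sample / sample_of_interest.
    { printf "%s\t%.4f\n", $1, min / $2 }
  ' "$1" "$1"
}

# Made-up counts, for illustration only.
counts=$(mktemp)
printf 'sampleA\t250000\nsampleB\t500000\nsampleC\t400000\n' > "$counts"
spike_factors "$counts"
rm -f "$counts"
```

With this definition the sample with the fewest DM tags gets a factor of 1 and every other sample gets a factor below 1, so each sample's factor could then be passed on as its ${SPIKE} value.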
Then I use the normalisation factor ${SPIKE} in deepTools as follows:
bamCoverage -b ${SAMPLE}/${SAMPLE}.nodup.norm.bam -o /scratch/${PBS_JOBID}/${SAMPLE}/${SAMPLE}.bw \
    -p 1 --normalizeTo1x ${genomesize} \
    --scaleFactor ${SPIKE} \
    --ignoreForNormalization chrX chrY \
    -bl ${ENCODEBED} \
    --smoothLength 40 \
    --binSize 10 \
    --centerReads --extendReads 150
I am getting a modest increase in signal, whereas the vendor's results show a modest decrease. I am wondering whether I am doing something wrong with the bamCoverage --scaleFactor argument, or whether the discrepancy comes from differences in how the bw files are processed in my pipeline (deepTools-based) and the vendor's (custom-made).
Thanks a lot for your support and advice
I actually have a question back for you: which vendor did you obtain the Drosophila genome spike-in from? I am looking for one myself. Thanks.
Active Motif.
You can also try it yourself if you have access to Drosophila S2 cells. Just crosslink the Drosophila cells independently, mix them with your target cells at a ratio of 1:10 (we do 1:100), and then proceed to sonication and all downstream ChIP-seq steps as usual.
Align and deduplicate the reads against both the target genome and the Drosophila genome.
Use the number of uniquely mapped Drosophila reads (those not mapping to the target genome) for normalization.