I've been running MAGeCK RRA and providing a list of negative control sgRNAs using --norm-method control and --control-sgrna negative_ctrls.txt
In my gene_summary file I see these negative controls appearing as the genes with the highest and lowest fold changes. For one set of samples, the RRA log file includes a lot of 'Skipping gene ... for permutation ...' messages, but for another set of samples these messages apparently do not appear.
My colleague said the negative controls should not appear in the gene summary. Are they correct? Is there something I need to change to get the negative controls out of the results? Is this likely to be a code issue or an issue of poor quality data?
Thank you!
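To make the symptom easier to reproduce, here is a minimal sketch of how one might check which gene_summary rows correspond to the negative controls. It assumes the count table has sgRNA and Gene columns and that the gene summary has an id column plus neg|rank and pos|rank; the toy data, the gene name NonTargeting, and the function name are hypothetical stand-ins for the real files, which would be read with pd.read_csv(..., sep="\t").

```python
import pandas as pd

def controls_in_summary(counts, control_ids, summary):
    """Return the gene_summary rows whose gene is targeted by a control sgRNA."""
    # Map the control sgRNA IDs to the gene names they carry in the count table.
    is_ctrl = counts["sgRNA"].isin(set(control_ids))
    ctrl_genes = set(counts.loc[is_ctrl, "Gene"])
    # Keep only the summary rows for those genes.
    return summary[summary["id"].isin(ctrl_genes)]

# Toy stand-ins for the count table, the control list and the gene summary.
counts = pd.DataFrame({
    "sgRNA": ["g1_a", "g1_b", "ctrl_sg1", "ctrl_sg2"],
    "Gene": ["GENE1", "GENE1", "NonTargeting", "NonTargeting"],
    "day0": [100, 120, 90, 95],
    "day14": [10, 15, 88, 97],
})
control_ids = ["ctrl_sg1", "ctrl_sg2"]  # contents of negative_ctrls.txt
summary = pd.DataFrame({
    "id": ["NonTargeting", "GENE1"],
    "neg|rank": [1, 2],
    "pos|rank": [2, 1],
})

hits = controls_in_summary(counts, control_ids, summary)
print(hits)
```

An empty result would mean no control gene made it into the summary; any rows returned show where the controls rank.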
Please provide code and output examples; textual descriptions are hard to debug. IIRC, when I used MAGeCK RRA for shRNA screening it did not return controls in the output. But I also wanted to see the stats for the negative controls, so I duplicated the negative controls and added a "_1" to the duplicates. I provided the original names to the --norm-method control --control-sgrna parameters, so the normalization and the permutation distribution were based on them. The duplicated ones still came back in the output, so I could double-check that most controls were not called as significant. But yes, "normally" they should not be in the results.
Sorry! Will include code next time!
FYI, my current solution is to stop using RRA and use MLE for the direct sample-to-sample comparisons as well.
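For anyone who wants to try the duplicated-controls trick described in the reply above, here is a rough sketch of how the count table could be prepared. It is not the original poster's code. It assumes a count table with sgRNA and Gene columns followed by sample columns, and it appends "_1" to both the sgRNA ID and the gene name of each duplicate (suffixing the gene name is my assumption, so that the copies form their own gene group). The toy data and the function name are hypothetical.

```python
import pandas as pd

def duplicate_controls(counts, control_ids, suffix="_1"):
    """Append a renamed copy of every control sgRNA row to the count table."""
    is_ctrl = counts["sgRNA"].isin(set(control_ids))
    dupes = counts[is_ctrl].copy()
    # Rename the copies so they are not listed in the control sgRNA file.
    dupes["sgRNA"] = dupes["sgRNA"] + suffix
    # Assumption: also rename the gene so the copies are scored as a separate gene.
    dupes["Gene"] = dupes["Gene"] + suffix
    return pd.concat([counts, dupes], ignore_index=True)

# Toy stand-in for a real count table.
counts = pd.DataFrame({
    "sgRNA": ["g1_a", "g1_b", "ctrl_sg1", "ctrl_sg2"],
    "Gene": ["GENE1", "GENE1", "NonTargeting", "NonTargeting"],
    "day0": [100, 120, 90, 95],
    "day14": [10, 15, 88, 97],
})
control_ids = ["ctrl_sg1", "ctrl_sg2"]  # original names, as given to --control-sgrna

augmented = duplicate_controls(counts, control_ids)
print(augmented)
```

The augmented table could then be written out with augmented.to_csv(path, sep="\t", index=False) and used as the count input, while the control sgRNA file keeps only the original, unsuffixed names.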