I am having some trouble with REDItools2 (HPC-REDItools). The original version of REDItools included a Python script for de novo prediction of editing sites from RNA-seq data, using the strategy outlined in Picardi et al. 2012, and I assumed REDItools2 would have the same feature. However, after I supplied the appropriate input files (BAM alignments from our RNA-seq data), the output seems to contain more than predicted editing sites: it appears to be a table of every position with any coverage, along with every base observed there:
Region Position Reference Strand Coverage-q30 MeanQ BaseCount[A,C,G,T] AllSubs Frequency gCoverage-q30 gMeanQ gBaseCount[A,C,G,T] gAllSubs gFrequency
chr1 15210 C 2 1 40.00 [0, 0, 1, 0] CG 1.00 - - - - -
chr1 15211 T 2 1 40.00 [0, 0, 0, 1] - 0.00 - - - - -
chr1 15212 G 2 1 40.00 [0, 1, 0, 0] GC 1.00 - - - - -
chr1 15213 T 2 1 40.00 [0, 0, 0, 1] - 0.00 - - - - -
chr1 15214 T 2 1 40.00 [0, 1, 0, 0] TC 1.00 - - - - -
chr1 15215 T 2 1 40.00 [0, 1, 0, 0] TC 1.00 - - - - -
chr1 15216 G 2 1 40.00 [0, 1, 0, 0] GC 1.00 - - - - -
chr1 15217 A 2 1 40.00 [0, 1, 0, 0] AC 1.00 - - - - -
Is there a way to limit the output to predicted editing sites only? Or is there a function that takes this table as input and performs the prediction? If so, it is not clear to me from reading the GitHub page how to do that. Any assistance would be appreciated!
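To make the question concrete, here is a minimal sketch of the kind of post-filtering I mean. It is not a REDItools2 function and not the Picardi et al. strategy. It assumes the real output file is tab-separated with the header shown above, and the function name `filter_candidates` and all thresholds (minimum coverage, minimum frequency, allowed substitution types) are hypothetical placeholders chosen for illustration.

```python
import csv
import io


def filter_candidates(handle, min_coverage=10, min_frequency=0.1,
                      allowed_subs=("AG", "TC")):
    """Yield rows of a REDItools2-style table that pass simple thresholds.

    Assumptions (not taken from REDItools2 itself): the table is
    tab-separated, and the thresholds are arbitrary illustrative values.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    for row in reader:
        # "-" in AllSubs means no substitution was observed at this position.
        if row["AllSubs"] == "-":
            continue
        if int(row["Coverage-q30"]) < min_coverage:
            continue
        if float(row["Frequency"]) < min_frequency:
            continue
        # AllSubs may list several substitutions separated by spaces.
        if not any(sub in allowed_subs for sub in row["AllSubs"].split()):
            continue
        yield row


# A few rows from the table above, rebuilt with tab separators.
HEADER = ["Region", "Position", "Reference", "Strand", "Coverage-q30",
          "MeanQ", "BaseCount[A,C,G,T]", "AllSubs", "Frequency",
          "gCoverage-q30", "gMeanQ", "gBaseCount[A,C,G,T]", "gAllSubs",
          "gFrequency"]
ROWS = [
    ["chr1", "15210", "C", "2", "1", "40.00", "[0, 0, 1, 0]", "CG", "1.00",
     "-", "-", "-", "-", "-"],
    ["chr1", "15211", "T", "2", "1", "40.00", "[0, 0, 0, 1]", "-", "0.00",
     "-", "-", "-", "-", "-"],
    ["chr1", "15212", "G", "2", "1", "40.00", "[0, 1, 0, 0]", "GC", "1.00",
     "-", "-", "-", "-", "-"],
    ["chr1", "15214", "T", "2", "1", "40.00", "[0, 1, 0, 0]", "TC", "1.00",
     "-", "-", "-", "-", "-"],
]
SAMPLE = "\n".join("\t".join(fields) for fields in [HEADER] + ROWS) + "\n"

if __name__ == "__main__":
    # min_coverage=1 only because the example rows have a coverage of 1.
    for hit in filter_candidates(io.StringIO(SAMPLE), min_coverage=1):
        print(hit["Region"], hit["Position"], hit["AllSubs"], hit["Frequency"])
```

With real data the input would be an open file handle rather than `io.StringIO`, and a single supporting read at a position would presumably not be enough, which is why the default coverage threshold is set higher than what the example rows contain.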