Figure 2 compares prediction-score distributions across truth categories. It is built from individually scored candidate sites rather than from the aggregate benchmark summaries, because the aggregate summaries do not retain the per-site scores needed to draw a distribution. The plot compares four site classes: experimentally validated true off-target sites, experimentally tested false sites, sites with unknown truth status, and candidate sites that a tool scored but that were not experimentally sequenced.
Two tables provide these inputs. figure_2_prediction_score_categories.csv contains one row per scored candidate site and tool, with the raw tool score and the truth category assigned to that site. figure_2_true_false_score_significance.csv summarizes the statistical separation between scores assigned to validated true and false sites for each tool. The first table is used to draw the score distributions; the second table records the corresponding true-versus-false comparison.
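As a minimal sketch of how the per-site table could feed both the distributions and a true-versus-false comparison, the Python below groups scores by tool and truth category and computes one possible separation measure. The column names (tool, score, truth_category) and the example rows are hypothetical, since the text does not give the table header. The measure is also an assumption: it is the probability that a validated true site outscores a false site, which equals the Mann-Whitney U statistic divided by the number of pairs. It is not necessarily the statistic recorded in figure_2_true_false_score_significance.csv.

```python
import csv
import io
from collections import defaultdict

# Hypothetical column names; the real header of
# figure_2_prediction_score_categories.csv is not given in the text.
TOOL_COL, SCORE_COL, CATEGORY_COL = "tool", "score", "truth_category"


def group_scores(csv_text):
    """Return {tool: {category: [scores]}} from per-site CSV text."""
    grouped = defaultdict(lambda: defaultdict(list))
    for row in csv.DictReader(io.StringIO(csv_text)):
        grouped[row[TOOL_COL]][row[CATEGORY_COL]].append(float(row[SCORE_COL]))
    return grouped


def true_false_separation(true_scores, false_scores):
    """Probability that a random true site outscores a random false site.

    Ties count as one half. Returns None when either group is empty.
    This is an illustrative choice of statistic, not the documented one.
    """
    if not true_scores or not false_scores:
        return None
    wins = 0.0
    for t in true_scores:
        for f in false_scores:
            if t > f:
                wins += 1.0
            elif t == f:
                wins += 0.5
    return wins / (len(true_scores) * len(false_scores))


# Synthetic rows for illustration only; these are not values from the study.
EXAMPLE_CSV = """tool,score,truth_category
tool_a,0.9,true
tool_a,0.7,true
tool_a,0.2,false
tool_a,0.7,false
tool_a,0.5,unknown
tool_b,12.0,true
tool_b,3.0,not_sequenced
"""

grouped = group_scores(EXAMPLE_CSV)
separation_a = true_false_separation(
    grouped["tool_a"].get("true", []), grouped["tool_a"].get("false", [])
)
# tool_b has no tested-false sites in this toy input, so no comparison is possible.
separation_b = true_false_separation(
    grouped["tool_b"].get("true", []), grouped["tool_b"].get("false", [])
)
```

The comparison is computed separately for each tool and uses only the true and false categories. The unknown and not_sequenced scores stay in the grouped data for plotting but do not enter the statistic.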
The plotted categories are true, false, unknown, and not_sequenced, each shown only when present for a given tool. The score scale is not harmonized across tools, so each panel should be read on its own: scores are comparable within a tool but not between tools.
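The panel rules can be sketched as a small Python helper, under stated assumptions. It keeps only the categories a tool actually has, in a fixed order, and derives axis limits from that tool's own scores so that no shared scale is implied. The function name panel_layout, the input structure, and the example values are hypothetical and are not taken from the figure code.

```python
# Fixed display order for the truth categories named in the text.
CATEGORY_ORDER = ("true", "false", "unknown", "not_sequenced")


def panel_layout(scores_by_tool):
    """Return per-tool panel specs from {tool: {category: [scores]}}.

    Each spec lists the categories present for that tool and the score
    range of that tool alone (no harmonization across tools).
    """
    panels = {}
    for tool, by_category in scores_by_tool.items():
        present = [c for c in CATEGORY_ORDER if by_category.get(c)]
        all_scores = [s for c in present for s in by_category[c]]
        panels[tool] = {
            "categories": present,
            "score_limits": (min(all_scores), max(all_scores)) if all_scores else None,
        }
    return panels


# Synthetic scores for illustration only; the two tools use different scales.
example = {
    "tool_a": {"true": [0.9, 0.7], "false": [0.2], "unknown": [0.5]},
    "tool_b": {"true": [12.0], "not_sequenced": [3.0, 40.0]},
}
panels = panel_layout(example)
```

Because the limits are computed per tool, a score of 0.9 in one panel and 12.0 in another carry no relative meaning. Only the ordering of categories within a panel is interpretable.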