This document briefly describes the architecture of correctionlib. It is meant to provide a good starting point for new contributors to find their way around the codebase. It assumes some familiarity with correctionlib as a user.

schemav2 module: Pydantic models for correctionlib's data structures
- CorrectionSet is a list of Corrections
- Correction represents a single correction. Its data attribute is of Content type and represents the root node of the computation graph for this correction. Corrections also have a list of inputs as well as one output of type Variable (basically a pair of a name and a type, int/float/string)
- Content is the type of a node in the computation graph of a Correction. It's a Union of the various types of corrections available: Binning, MultiBinning, Category, Formula, FormulaRef, Transform, HashPRNG, float

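
As a rough illustration of the structure described above, the sketch below writes one correction as a plain Python dict and serializes it to JSON. The field names are meant to follow the v2 schema but should be checked against schemav2, which remains the authoritative definition. The correction name example_sf, the variable names, and all numbers are made up for illustration.

```python
import json

# A float is the simplest Content node: a constant leaf of the graph.
# A binning node selects one of its child Content nodes based on an input value.
correction = {
    "name": "example_sf",  # hypothetical name, for illustration only
    "version": 1,
    "inputs": [{"name": "pt", "type": "real"}],
    "output": {"name": "weight", "type": "real"},
    "data": {  # root node of the computation graph
        "nodetype": "binning",
        "input": "pt",
        "edges": [0.0, 20.0, 40.0],
        "content": [1.10, 1.05],  # one child node per bin, here float leaves
        "flow": "clamp",
    },
}

# A correction set is essentially a list of such corrections.
correction_set = {"schema_version": 2, "corrections": [correction]}
serialized = json.dumps(correction_set)
```

Any entry of content could itself be another node (for example a nested binning or a category), which is how the computation graph gets its depth.
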
highlevel module: user-facing types (correctionlib.Correction resolves to correctionlib.highlevel.Correction, etc.)
- CorrectionSet is a list of Corrections (same as in schemav2, but the focus is on the user API rather than on defining the schema/structure of the corrections)
- Correction and CompoundCorrection wrap the corresponding C++ evaluator and expose the evaluate method

_core module: a small module that contains the Python facades for the corresponding C++ types, in __init__.pyi.
- types are CorrectionSet, Correction, CompoundCorrection and Variable
- the bindings are declared in src/python.cc

include/correction.h and src/correction.cc contain the C++ types that
perform the actual computations:
- a Variable type with a name and a type (string, integer, real)
- a CorrectionSet type, which builds a list of Corrections
- the Correction type, which builds a compute graph of correction nodes
- types for the different types of nodes in a correction's compute graph, e.g. Binning, Formula, each with its evaluate method. They are constructed by deserializing a JSON object. Formula::Formula, for example, parses a TFormula expression in the JSON and builds the corresponding FormulaAST

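
The parse-once, evaluate-many idea behind the Formula node can be sketched in a few lines of Python. This is only an analogy: it uses Python's own ast module as a stand-in for FormulaAST, handles plain arithmetic rather than TFormula syntax, and the function names parse_formula and evaluate are hypothetical.

```python
import ast
import operator

# Binary operators supported by this toy evaluator.
_BINOPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def parse_formula(expression):
    """Parse the expression text once, up front, into an AST."""
    return ast.parse(expression, mode="eval").body


def evaluate(node, variables):
    """Recursively evaluate an AST node for the given variable values."""
    if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
        return float(node.value)
    if isinstance(node, ast.Name):
        return float(variables[node.id])
    if isinstance(node, ast.BinOp) and type(node.op) in _BINOPS:
        left = evaluate(node.left, variables)
        right = evaluate(node.right, variables)
        return _BINOPS[type(node.op)](left, right)
    if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
        return -evaluate(node.operand, variables)
    raise ValueError(f"unsupported expression element: {ast.dump(node)}")


# The AST is built once and can then be evaluated for many inputs.
formula_ast = parse_formula("2*x + 1")
result = evaluate(formula_ast, {"x": 3.0})
```

Parsing at construction time means that errors in the expression surface when the correction is loaded, not in the middle of an evaluation loop.
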
In short, the C++ correction objects that perform the actual correction
evaluations are constructed from the JSON representations of the Pydantic types
defined in schemav2.
Let's say the user calls schemav2.Correction.to_evaluator. This:
- constructs a schemav2.CorrectionSet (the pydantic model)
- constructs a highlevel.CorrectionSet from it and immediately extracts the right highlevel.Correction from it, returning it

The internal C++ correction evaluators are actually built when the
highlevel.CorrectionSet is constructed: it converts the Pydantic
CorrectionSet to JSON and uses it to construct a _core.CorrectionSet (using
CorrectionSet.from_string).
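
The Python side of this flow can be mimicked with standard-library stand-ins. The sketch below is not the correctionlib code: every class in it is a hypothetical simplification (dataclasses instead of Pydantic models, a pure-Python class instead of the C++ evaluator), and the data field holds a bare float where the real model holds any Content node.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class ModelCorrection:
    """Stand-in for schemav2.Correction (the real one is a Pydantic model)."""
    name: str
    data: float  # simplification: the real field holds any Content node

    def to_evaluator(self):
        # Wrap this single correction in a set, build the evaluators,
        # then pick this correction back out by name.
        model_set = ModelCorrectionSet(corrections=[self])
        return HighLevelCorrectionSet(model_set)[self.name]


@dataclass
class ModelCorrectionSet:
    """Stand-in for schemav2.CorrectionSet."""
    corrections: list = field(default_factory=list)


class CoreCorrection:
    """Stand-in for the C++ evaluator exposed through _core."""
    def __init__(self, obj):
        self.name = obj["name"]
        self._value = obj["data"]

    def evaluate(self, *args):
        return self._value


class CoreCorrectionSet:
    """Stand-in for _core.CorrectionSet."""
    def __init__(self, corrections):
        self.corrections = corrections

    @classmethod
    def from_string(cls, text):
        parsed = json.loads(text)
        return cls({c["name"]: CoreCorrection(c) for c in parsed["corrections"]})


class HighLevelCorrectionSet:
    """Stand-in for highlevel.CorrectionSet."""
    def __init__(self, model_set):
        # Model -> JSON string -> core evaluators.
        self._base = CoreCorrectionSet.from_string(json.dumps(asdict(model_set)))

    def __getitem__(self, name):
        # Simplification: returns the core stand-in directly, with no wrapper.
        return self._base.corrections[name]


evaluator = ModelCorrection(name="constant_sf", data=1.5).to_evaluator()
value = evaluator.evaluate()
```

The point of the JSON round trip is that the evaluator side only ever needs to understand one input format, whether the correction came from a file or from a model built in memory.
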
- _core.CorrectionSet.from_string constructs a rapidjson JSONObject and calls CorrectionSet(const JSONObject &)
- then for each object in JSONObject it constructs a Correction (Correction(const JSONObject&)), and puts it in CorrectionSet::corrections_
- Correction::Correction(const JSONObject&) sets data_ to the output of resolve_content, passing the json
- resolve_content (defined in correction.cc) constructs the appropriate type depending on the JSON input (if/else-ing over the known correction types)
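
The construction steps above can be sketched in Python as follows. This is a loose transliteration for orientation, not the C++ implementation: it covers only float, binning and category nodes, always clamps out-of-range bin lookups, and passes inputs as a dict keyed by name, which is an assumption made here for brevity. The class and function names mirror the C++ ones, but the bodies are simplified.

```python
import bisect
import json


class Binning:
    def __init__(self, obj):
        self.input = obj["input"]
        self.edges = obj["edges"]
        # Child nodes are resolved recursively, so the graph is built top-down.
        self.content = [resolve_content(child) for child in obj["content"]]

    def evaluate(self, values):
        # Locate the bin; out-of-range values are clamped to the edge bins.
        index = bisect.bisect_right(self.edges, values[self.input]) - 1
        index = min(max(index, 0), len(self.content) - 1)
        return evaluate(self.content[index], values)


class Category:
    def __init__(self, obj):
        self.input = obj["input"]
        self.content = {
            item["key"]: resolve_content(item["value"]) for item in obj["content"]
        }

    def evaluate(self, values):
        return evaluate(self.content[values[self.input]], values)


def resolve_content(obj):
    """Build the node matching a JSON value by if/else over the known types."""
    if isinstance(obj, (int, float)):
        return float(obj)
    if obj["nodetype"] == "binning":
        return Binning(obj)
    if obj["nodetype"] == "category":
        return Category(obj)
    raise ValueError(f"unknown nodetype: {obj['nodetype']}")


def evaluate(node, values):
    """A float leaf evaluates to itself; other nodes use their evaluate method."""
    return node if isinstance(node, float) else node.evaluate(values)


class Correction:
    def __init__(self, obj):
        self.name = obj["name"]
        self.data = resolve_content(obj["data"])  # root of the compute graph

    def evaluate(self, values):
        return evaluate(self.data, values)


class CorrectionSet:
    def __init__(self, obj):
        self.corrections = [Correction(c) for c in obj["corrections"]]

    @classmethod
    def from_string(cls, text):
        return cls(json.loads(text))


# Hypothetical input: a category node whose "A" branch is a binning in pt.
binning = {"nodetype": "binning", "input": "pt",
           "edges": [0, 20, 40], "content": [1.1, 1.05]}
category = {"nodetype": "category", "input": "era",
            "content": [{"key": "A", "value": binning},
                        {"key": "B", "value": 0.9}]}
text = json.dumps({"corrections": [{"name": "sf", "data": category}]})
sf = CorrectionSet.from_string(text).corrections[0]
```

Because each node constructor calls resolve_content on its own children, a single call on the root JSON object is enough to build the whole graph.
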
