Framework output files MUST be readable outside the framework with minimal overhead (i.e. bare ROOT).
- [Devs]
- By framework output file, do you mean the default framework-produced ROOT file that contains all physics data + framework metadata? Does bare ROOT mean using the root command line without any extra dictionaries than what are already provided by ROOT? Or does it mean the root command line with extra dictionaries? Or something else?
- [DUNE]
- Yes. The preferred behavior is that framework data files (ARTROOT like files) are easily interrogated by users using ROOT without external dependencies or extra dictionaries.
- [Devs]
- For what purposes would you like to read framework-produced data outside of the framework? The extent to which a framework output file can be read outside of a framework depends heavily on the kinds of data to be read and the user-provided layouts of those data. We imagine there are different kinds of data from an output file one may want to read (e.g.):
* Detector raw data
* Derived data products
* Simulation results
* Data provenance
* Configuration data
* User-provided ancillary data
* Persisted (uni- and bi-)directional associations between data products (e.g. the kind of information stored in art::Ptr<T>s and art::Assns<A, B, D>s)
If you can provide specific use cases, we can evaluate the feasibility of this request.
- [DUNE]
- [no further response to date]
Requiring that "framework data files (ARTROOT like files) are easily interrogated by users using ROOT without external dependencies or extra dictionaries" places severe restrictions on the format and content of those files. Taking ROOT specifically as the example, this means that:
-
While the raw data from the file and their hierarchical relationships (e.g. class data members) remain accessible, interfaces allowing the manipulation of those data (e.g. conversion between polar and Cartesian coordinates, or calculation of pseudorapidity) do not. Users of the Art framework have always been encouraged to define data products without such interfaces; where a data product does provide one, req. 009 implies that it would be unavailable when reading the data with bare ROOT.
-
Art-style I/O-format-agnostic references and associations (e.g. Ptr<T>, Assns<A, B, D>) are not navigable with bare ROOT.
* In the old TTree-based format used by Art, ROOT supports references via bare pointers. However, this has implications for the framework's in-memory representations and would make satisfying other requirements significantly harder (e.g. req. 002, "The framework MUST separate the persistent data representation from the in-memory representation seen by algorithms," and req. 051, "The framework, through the I/O modules/plugins, MUST provide for the capability of reading from and writing to various types (e.g. ROOT, HDF5, 'object stores')"). Historically, frameworks have often forbidden (Art) or severely restricted (CMSSW) the use of C++-style pointers and references in persistent data. Such use would also appear to conflict with req. 007, "Modules within the framework MUST be allowed to contain code from a variety of languages."
* In the newer ROOT RNTuple format, only basic types are expected to be stored. Consequently, C++-style pointers and references are explicitly disallowed, and while it would be possible to store I/O-format-agnostic references à la art::Ptr, actually using them to access the referenced data would require external code. ROOT-native references (a.k.a. links) are envisaged (eventually) for RNTuple, but neither the details of that feature nor its timescale for implementation are currently defined.
-
In all likelihood, whether any given framework-produced data file satisfies req. 009 will depend on the user-specified structure of the data and on the exact meaning of the phrase 'easily interrogated': how, and to what end? In any event, it is unlikely that any manifestation of framework-originated metadata (e.g. provenance tracking) would satisfy req. 009 while remaining useful for the purpose for which it was created.
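To make the two restrictions above concrete, the following is a minimal sketch of what a dictionary-free reader would have to do by hand. It does not use ROOT or the framework's actual persistent layout: the plain Python lists stand in for columns of basic types read from a file, and every name in it (track_px, hit_track_index, pseudorapidity, hits_per_track) is hypothetical. The point is only that derived quantities and index-based references, in the spirit of art::Ptr, both need user-supplied code once the class interface is gone.

```python
import math

# Hypothetical flat columns standing in for what a dictionary-free reader sees:
# basic types only, with no member functions attached.
track_px = [1.0, 0.0, 3.0]
track_py = [0.0, 2.0, 4.0]
track_pz = [1.0, 0.0, 0.0]
hit_charge = [10.5, 7.25, 3.0, 8.0]
# Hypothetical persisted reference: the index of the track each hit belongs to.
hit_track_index = [2, 0, 0, 1]


def pseudorapidity(px, py, pz):
    """Recompute eta = asinh(pz / pt) from raw momentum components."""
    pt = math.hypot(px, py)
    if pt == 0.0:
        # Momentum along the beam axis: eta diverges, signed by pz.
        return math.copysign(math.inf, pz)
    return math.asinh(pz / pt)


def hits_per_track(track_index_column, n_tracks):
    """Invert a hit -> track index column into track -> list of hit indices."""
    hits_of = [[] for _ in range(n_tracks)]
    for hit, trk in enumerate(track_index_column):
        if not 0 <= trk < n_tracks:
            raise IndexError(f"hit {hit} refers to missing track {trk}")
        hits_of[trk].append(hit)
    return hits_of


# The "interface" the data product class would otherwise have provided.
eta = [pseudorapidity(*p) for p in zip(track_px, track_py, track_pz)]

# Navigating the persisted association requires this external lookup step.
hits_of = hits_per_track(hit_track_index, len(track_px))
charge_per_track = [sum(hit_charge[h] for h in hits) for hits in hits_of]
```

Neither helper is difficult to write, but each encodes knowledge (the meaning of the momentum components, the convention that the index column refers to positions in the track columns) that lives outside the file. This is the kind of external code that a strict reading of req. 009 would rule out.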