|
| 1 | +# Purpose of this directory |
| 2 | + |
| 3 | +This directory holds an example showing how to convert a |
| 4 | +module that works with the `art` framework into an |
| 5 | +algorithm that works with the `phlex` framework. This |
| 6 | +particular example converts the module defined in |
| 7 | +the file GausHitFinder_module.cc from the LArSoft larreco |
| 8 | +repository. |
| 9 | + |
| 10 | +This is only an example and a prototype to be used in |
| 11 | +testing. It is not intended to be the version of this |
| 12 | +module used by DUNE or any other experiment |
| 13 | +for production or physics. That belongs in some other |
| 14 | +repository. |
| 15 | + |
| 16 | +This example is based on version v0.1.0 of `phlex`. It will need |
| 17 | +some minor modifications to work with the newest version of `phlex`. |
| 18 | +The migration to phlex started with version of GausHitFinder_module.cc |
| 19 | +in v10_05_00 of larreco which was the version in use with DUNE software |
| 20 | +at the time the migration work was started. |
| 21 | + |
| 22 | +# Work in Progress |
| 23 | + |
| 24 | +This is work in progress. It is not final. It is intended |
| 25 | +that this example will change as we learn more about using |
| 26 | +`phlex` and as we receive feedback from DUNE. Currently, this |
| 27 | +is more like a rough first draft of our first attempts |
| 28 | +to use `phlex`. Please don't expect stability. Many features |
| 29 | +of `phlex` have not been implemented and the design of `phlex` |
| 30 | +itself is changing. Those changes will also cause this example |
| 31 | +to change. |
| 32 | + |
| 33 | +As we study this example, it will guide how phlex is used. |
| 34 | +As we run tests, it may point to problems in phlex and this |
| 35 | +may guide which areas developers should focus on to fix |
| 36 | +issues and remove performance bottlenecks. |
| 37 | + |
| 38 | +# Files in this directory |
| 39 | + |
| 40 | +## Interesting files that are the heart of the example |
| 41 | + |
| 42 | +1. find_hits_with_gaussians.hpp |
| 43 | +2. find_hits_with_gaussians.cpp |
| 44 | +3. register_find_hits_with_gaussians.cpp |
| 45 | +4. test_find_hits_with_gaussians.jsonnet |
| 46 | + |
| 47 | +## These files exist only for test purposes |
| 48 | + |
| 49 | +There is one function that will print out reconstructed hits. |
| 50 | +This printout can be used to compare results between an |
| 51 | +`art` process and a `phlex` process running GausHitFinder. |
| 52 | +There are functions that can be used to persistently store |
| 53 | +the input from an `art` process (a vector of `Wire`) so |
| 54 | +it can be used in the `phlex` process. And there is a `phlex` provider |
| 55 | +to read this input. This persistence mechanism is necessary |
| 56 | +because there is not an alternative yet. These files are all |
| 57 | +temporary and not part of the example. |
| 58 | + |
| 59 | +1. print_hits.hpp |
| 60 | +2. print_hits.cpp |
| 61 | +3. wire_serialization.hpp |
| 62 | +4. wire_serialization.cpp |
| 63 | +5. wires_source.cpp |
| 64 | + |
| 65 | +## Files copied from LArSoft |
| 66 | + |
| 67 | +The infrastructure does not yet exist to link the |
| 68 | +code here with LArSoft code or include headers. |
| 69 | +Necessary files are copied in with minimal changes. |
| 70 | +Some changes are necessary because the files are in |
| 71 | +a different location and some remove unwanted dependences, |
| 72 | +but the modifications are small. When it is possible |
| 73 | +to depend on LArSoft code in some other repository, |
| 74 | +these files should be deleted to remove duplication. |
| 75 | + |
| 76 | +These files are in the subdirectory: |
| 77 | + |
| 78 | + ```phlex-examples/gaus_hit_finder/copied_from_larsoft_minor_edits``` |
| 79 | + |
| 80 | +# Where more work is needed |
| 81 | + |
| 82 | +At the moment, this is implemented as a single `phlex` |
| 83 | +transform. Does it make sense to split it with unfolds |
| 84 | +and folds? Does it make sense to have multiple transforms? |
| 85 | +We don't know the answers to these questions yet. |
| 86 | +Getting the single transform to work is a good first step. |
| 87 | + |
| 88 | +The next thing we plan to work on is creating a second version |
| 89 | +of the migrated module. We will try replacing the |
| 90 | +outer `parallel_for` with an `unfold`, `transform`, and `fold` |
| 91 | +sequence of algorithms. After that we will create other versions |
| 92 | +pushing the division into separate algorithms to lower levels. |
| 93 | +Then we plan to run tests and compare the different versions. |
| 94 | + |
| 95 | +`Phlex` will not support the `Tools` feature that existed in |
| 96 | +`art`. One possibility is that algorithm nodes scheduled |
| 97 | +by `tbb::flow_graph` offer sufficient configurability that we |
| 98 | +don't need a separate plugin system like `Tools`. |
| 99 | +Another possibility is that we eventually implement a |
| 100 | +plugin system for `phlex`. For now, I replaced the two `Tool` |
| 101 | +types used in `GausHitFinder` with direct instantiations of one |
| 102 | +of the plugin types. I instantiated a vector of objects of type |
| 103 | +`CandHitStandard` and one object of type `PeakFitterMrqdt`. They |
| 104 | +needed some modification but are as close as reasonably possible |
| 105 | +to the original versions. |
| 106 | + |
| 107 | +I do not know yet how to deal with the fact that `GausHitFinder` |
| 108 | +can be configured to produce 1 or 2 output data products. |
| 109 | +For now, the example can only produce one `std::vector<recob::Hit>`. |
| 110 | +It's configurable whether to filter hits or not, but it cannot |
| 111 | +produce both outputs at the same time. |
| 112 | + |
| 113 | +For now, this ignores the fact that `GausHitFinder` also can |
| 114 | +produce `art::Assns<recob::Wire, recob::Hit>` and |
| 115 | +`art::Assns<raw::RawDigit, recob::Hit>`. In the `art` version, |
| 116 | +it is configurable whether to produce those or not. |
| 117 | +We are ignoring this because `phlex` does not yet support `art::Assns` |
| 118 | +or whatever will be used to replace that type. For now, the |
| 119 | +example can only produce one `std::vector<recob::Hit>`. |
| 120 | + |
| 121 | +Here is a list of the files that were used as a basis |
| 122 | +for this example, but significantly modified from the |
| 123 | +LArSoft version: |
| 124 | + |
| 125 | +``` |
| 126 | +https://github.com/LArSoft/larreco/blob/develop/larreco/HitFinder/GausHitFinder_module.cc |
| 127 | +``` |
| 128 | + |
| 129 | +Some files were copied with minimal modifications. The "tool" |
| 130 | +classes were split into .h and .cxx files and are no longer implemented |
| 131 | +as `art Tools`, but otherwise the changes are minimal. |
| 132 | +Here is a list of the files that were copied from LArSoft |
| 133 | +with minimal modifications: |
| 134 | + |
| 135 | +``` |
| 136 | + https://github.com/LArSoft/larreco/blob/develop/larreco/HitFinder/HitFinderTools/CandHitStandard_tool.cc |
| 137 | + https://github.com/LArSoft/larreco/blob/develop/larreco/HitFinder/HitFinderTools/PeakFitterMrqdt_tool.cc |
| 138 | + https://github.com/LArSoft/larreco/blob/develop/larreco/HitFinder/HitFilterAlg.h |
| 139 | + https://github.com/LArSoft/larreco/blob/develop/larreco/HitFinder/HitFilterAlg.cxx |
| 140 | + https://github.com/LArSoft/lardataobj/blob/develop/lardataobj/RecoBase/Wire.h |
| 141 | + https://github.com/LArSoft/lardataobj/blob/develop/lardataobj/RecoBase/Wire.cxx |
| 142 | + https://github.com/LArSoft/lardataobj/blob/develop/lardataobj/RecoBase/Hit.h |
| 143 | + https://github.com/LArSoft/lardataobj/blob/develop/lardataobj/RecoBase/Hit.cxx |
| 144 | + https://github.com/LArSoft/larcoreobj/blob/develop/larcoreobj/SimpleTypesAndConstants/geo_types.h |
| 145 | + https://github.com/LArSoft/larcoreobj/blob/develop/larcoreobj/SimpleTypesAndConstants/geo_types.cxx |
| 146 | + https://github.com/LArSoft/larreco/blob/develop/larreco/HitFinder/HitFinderTools/ICandidateHitFinder.h |
| 147 | + https://github.com/LArSoft/larcoreobj/blob/develop/larcoreobj/SimpleTypesAndConstants/RawTypes.h |
| 148 | + https://github.com/LArSoft/lardataobj/blob/develop/lardataobj/Utilities/sparse_vector.h |
| 149 | + https://github.com/LArSoft/larvecutils/blob/develop/larvecutils/MarqFitAlg/MarqFitAlg.h |
| 150 | + https://github.com/LArSoft/larvecutils/blob/develop/larvecutils/MarqFitAlg/MarqFitAlg.cxx |
| 151 | + https://github.com/LArSoft/larreco/blob/develop/larreco/HitFinder/HitFinderTools/ICandidateHitFinder.h |
| 152 | + https://github.com/LArSoft/larreco/blob/develop/larreco/HitFinder/HitFinderTools/IPeakFitter.h |
| 153 | +``` |
| 154 | + |
| 155 | +For now, the part of `GausHitFinder` that fills 2 histograms is removed. |
| 156 | +Why? Because `phlex` does not yet have a histogramming facility like `TFileService` |
| 157 | +yet. Also those histograms were booked in `beginJob`. That entire method |
| 158 | +was removed. Booking the histograms was the only thing that was done in `beginJob`. |
| 159 | +I'm not sure how to support `beginJob` activity in `phlex` either. |
| 160 | + |
| 161 | +Dropped `fEventCount`. It was not used. Seems like not a good thing |
| 162 | +to keep track of in a `phlex` algorithm anyway. Reconstruction algorithms |
| 163 | +should not depend on event count. Also removed the related count argument |
| 164 | +from the findHitCandidates method of ICandidateHitFinder and its implementations. |
| 165 | +It was not used in CandHitStandard anyway. |
| 166 | + |
| 167 | +MessageLogger calls are commented out for now but not deleted. Eventually |
| 168 | +we will want to replace those with calls to whatever logging system `phlex` |
| 169 | +eventually uses or delete those lines entirely. |
| 170 | + |
| 171 | +For now, calls to the Geometry service are commented out but not deleted. |
| 172 | +The current plan is to replace those with a `phlex` provider that provides the geometry |
| 173 | +data as data products and these providers will be scheduled by the flow graph. |
| 174 | +In the main algorithm the geometry is used to get the plane number for each |
| 175 | +wire. For now, I just hard coded the plane number to 0. The signal type in |
| 176 | +the `recob::Hit` objects is also from the geometry and for now always set |
| 177 | +to 0. Similarly, the `WireID` in the `recob::Hit` objects is from the geometry |
| 178 | +and for now set to a default value. `FillOutHitParameterVector` depends |
| 179 | +on the geometry also, but it only checks the size of a configuration |
| 180 | +vector against the number of planes. If this is skipped but the |
| 181 | +configuration vector is sized to match the number of planes, then things |
| 182 | +should still work ok. For now, I just skip FillOutHitParameterVector. |
| 183 | + |
| 184 | +I copied in the content of the `TMath::Gaus` function from ROOT to |
| 185 | +avoid depending on ROOT in `phlex` code for now (slightly edited). |
| 186 | +Are we allowed to depend on ROOT in `phlex` algorithm code? That could |
| 187 | +easily be restored if the build system allows it. |
| 188 | + |
| 189 | +# Testing |
| 190 | + |
| 191 | +This was tested by running `GausHitFinder` with the `art` framework and running the modified `GausHitFinder` with the `phlex` framework. The two processes used identical input (the `std::vector<Wire>`). The output `std::vector<recob::Hit>` objects were compared using a text file printed during each process. The output was identical except for the following: |
| 192 | + |
| 193 | +Two output data members of `Hit` were ignored because they depend on the `Geometry` and that is not implemented yet for `phlex`. In the `phlex` version, these data members are filled with default values. |
| 194 | +``` |
| 195 | + geo::SigType_t fSignalType; ///< signal type for the plane of the hit |
| 196 | + geo::WireID fWireID; ///< WireID for the hit (Cryostat, TPC, Plane, Wire) |
| 197 | +``` |
| 198 | + |
| 199 | +The `phlex` process was run with multithreading and that causes the order of `Hit` objects to vary from one execution to the next and also the order of events to vary. In the comparison the `Hit` objects were sorted and we had to be careful to compare matching events. |
| 200 | + |
| 201 | +Contact the `phlex` group if you are interested in details related |
| 202 | +to running `art` side of this test or want the input files (or |
| 203 | +to regenerate them). |
0 commit comments