59 commits
da7e410
Add option to skip selection hists.
riga Jan 20, 2025
9a6e6d6
Flip stack plot order.
riga Jan 20, 2025
6230f3d
Update flag in law.cfg.
riga Jan 20, 2025
60545a0
Remove forest merge.
riga Jan 20, 2025
1c62af3
Revert selection.
riga Jan 20, 2025
4fbe585
Merge branch 'master' into feature/update_sel_ml_merging
riga Jan 20, 2025
de895f3
Merge branch 'master' into feature/flip_stack_order
riga Jan 20, 2025
12b5d0a
Merge branch 'master' into feature/skip_selection_hists
riga Jan 20, 2025
1e50af1
Apply suggestions from code review
riga Jan 20, 2025
13af3ac
Change default.
riga Jan 20, 2025
1929f81
Merge pull request #606 from columnflow/feature/update_sel_ml_merging
riga Jan 20, 2025
5b5bdf8
Merge branch 'master' into feature/skip_selection_hists
riga Jan 20, 2025
699fad0
Merge pull request #608 from columnflow/feature/skip_selection_hists
mafrahm Jan 20, 2025
8e07f74
Merge branch 'master' into feature/flip_stack_order
riga Jan 20, 2025
7443290
Remove flip_stack option.
riga Jan 20, 2025
e521901
Hotfix notifications from sandboxed tasks.
riga Jan 21, 2025
98ddd50
Merge branch 'master' into feature/flip_stack_order
riga Jan 24, 2025
841ff7d
Merge pull request #607 from columnflow/feature/flip_stack_order
riga Jan 24, 2025
7d6e0c2
Energy calibrations for electrons and photons (#610)
pkausw Jan 24, 2025
ecb6aec
hotfix: make dataset_inst in init_func optional for egamma modules
Jan 27, 2025
a730e2c
hotfix: separate the dataset_inst checks
Jan 27, 2025
20c98ab
correct path only_missing looks for in MergeHistograms
nprouvost Jan 28, 2025
5fee9e4
Merge pull request #613 from columnflow/bugfix_merge_histograms
pkausw Jan 28, 2025
d9d8cff
Hotfix: properly forward and use `deps_kwargs`. (#614)
dsavoiu Jan 29, 2025
03fadcd
Include hist hook for datacards (#539)
pkausw Feb 6, 2025
ca673f0
Hotfix rounding error in datacard for fake data generation.
riga Feb 6, 2025
afdcfb1
Add missing hist hook repr to datacard paths.
riga Feb 6, 2025
6e40750
fix bug where ReduceEvents workflow is submitted twice
mafrahm Feb 6, 2025
4b819ba
prioritise custom style config via command line
maadcoen Feb 6, 2025
e91e148
Merge pull request #617 from GhentAnalysis/style_config_priority
mafrahm Feb 10, 2025
ed7220c
Hotfix use of fake data in inference model.
riga Feb 10, 2025
0cbd2fb
Update input files lepton SFs (#615)
nprouvost Feb 10, 2025
e19c23c
Typo.
riga Feb 10, 2025
8cc9220
Hotfix weight producer info to CreateDatacards.
riga Feb 10, 2025
d8e3b30
Correct input keys for electron Efficiencies (#620)
nprouvost Feb 10, 2025
e18f1df
update sf variations names (or why you should test the code before st…
nprouvost Feb 11, 2025
da70a68
Fix lumi label precision to recommended digits.
riga Feb 11, 2025
2cdd51a
Utility functions and producers for generic delta-R matching (#594)
dsavoiu Feb 18, 2025
4367c55
correct eta calculation trigger sf (#632)
nprouvost Feb 28, 2025
6db27dc
Fix bugs in the cms_minimal analysis template (#636)
lmoureaux Feb 28, 2025
9627406
Fix/template issues (#642)
mafrahm Mar 24, 2025
0d48d70
Fix pu weight extraction. (#654)
riga Mar 27, 2025
3b970bc
Update law.
riga Mar 28, 2025
ce79258
typo on inference task (#655)
JulesVandenbroeck Mar 28, 2025
d97e494
docs: add aalvesan as a contributor for code (#659)
allcontributors[bot] Mar 28, 2025
da1b014
docs: add philippgadow as a contributor for code (#660)
allcontributors[bot] Mar 28, 2025
1bd7fca
add category uniqueness check in create_category_combinations (#611)
mafrahm Mar 31, 2025
384da20
Hotfix category id uniqueness check.
riga Mar 31, 2025
1ebac8c
Fix to jer application on jec variations (#665)
JulesVandenbroeck Apr 28, 2025
ab1372b
Refactoring for 0.3 release (#628)
riga May 27, 2025
578d8b7
Add tmp dir checks, add cf_setup_post_install hook.
riga May 28, 2025
db5f46b
correct json file extension
maadcoen May 28, 2025
54191a5
Hotfix category flattening.
riga May 28, 2025
c21f2c2
Merge pull request #691 from GhentAnalysis/upstream/fix_stats_file_ext
mafrahm May 28, 2025
c542511
normalize to the number of events, the weights are set to 1
Jun 25, 2025
051fbe6
high pileup weights
Jun 25, 2025
db9c146
Merge pull request #16 from DesyTau/anigamova_cf0p3_weights
anigamova Jul 17, 2025
ea29d08
Added support for specifying axis grid
hephysicist Dec 23, 2025
34dd77c
Merge pull request #19 from hephysicist/st_feature_grid
hephysicist Jan 5, 2026
17 changes: 17 additions & 0 deletions .all-contributorsrc
@@ -146,6 +146,23 @@
"contributions": [
"code"
]
},
{
"login": "aalvesan",
"name": "Ana Andrade",
"avatar_url": "https://avatars.githubusercontent.com/u/99343616?v=4",
"profile": "https://github.com/aalvesan",
"contributions": [
"code"
]
}, {
"login": "philippgadow",
"name": "philippgadow",
"avatar_url": "https://avatars.githubusercontent.com/u/6804366?v=4",
"profile": "https://github.com/philippgadow",
"contributions": [
"code"
]
}
],
"commitType": "docs"
4 changes: 2 additions & 2 deletions .flake8
@@ -1,7 +1,7 @@
[flake8]

-# line length of 100 is recommended, but set it to a forgiving value
-max-line-length = 120
+# line length of 120 is recommended, but set it to a forgiving value
+max-line-length = 121

# codes of errors to ignore
ignore = E128, E306, E402, E722, E731, E741, W504, Q003
1 change: 1 addition & 0 deletions .gitattributes
@@ -4,3 +4,4 @@
*.jpeg filter=lfs diff=lfs merge=lfs -text
*.root filter=lfs diff=lfs merge=lfs -text
*.ico filter=lfs diff=lfs merge=lfs -text
*.svg filter=lfs diff=lfs merge=lfs -text
12 changes: 7 additions & 5 deletions .markdownlint
@@ -1,10 +1,12 @@
plugins:
# disable max line length
md001:
enabled: False
md013:
# disable max line length
enabled: False
md033:
allowed_elements: "!--,![CDATA[,!DOCTYPE,table,h1,p,img"
md024:
siblings_only: True
md026:
punctuation: ".,;!。,;!"
md001:
enabled: False
md033:
allowed_elements: "!--,![CDATA[,!DOCTYPE,table,h1,a,p,img,div"
47 changes: 34 additions & 13 deletions README.md
@@ -1,6 +1,6 @@
<h1 align="center">
-<img src="https://media.githubusercontent.com/media/columnflow/columnflow/master/assets/logo_dark.png#gh-light-mode-only" width="480" />
-<img src="https://media.githubusercontent.com/media/columnflow/columnflow/master/assets/logo_bright.png#gh-dark-mode-only" width="480" />
+<img alt="light logo" src="https://media.githubusercontent.com/media/columnflow/columnflow/master/assets/logo_dark.png#gh-light-mode-only" width="480" />
+<img alt="dark logo" src="https://media.githubusercontent.com/media/columnflow/columnflow/master/assets/logo_bright.png#gh-dark-mode-only" width="480" />
</h1>

<!-- marker-before-badges -->
@@ -35,25 +35,39 @@ Original source hosted at [GitHub](https://github.com/columnflow/columnflow).

<!-- marker-before-note -->

-## Note on current development
+## ❗️ Note on v0.2 → v0.3 transition

-This project is currently in a beta phase.
+The 0.3 release introduces many performance fixes and new features such as
+
+- a new interface for all *task array functions* (calibrators, selectors, producers, etc.),
+- support for plotting data of multiple data taking campaigns at once,
+- a simplified machine learning interface, and
+- statistical inference models with support for merging data of different campaigns.
+
+However, some of these changes are potentially breaking existing code.
+Checkout the [v0.2 → v0.3 transition guide](https://columnflow.readthedocs.io/en/latest/user_guide/02_03_transition.html) as well as the [release notes](https://github.com/columnflow/columnflow/releases/tag/v0.3.0) for a detailed overview of the changes and how to adapt your code.
+
+Version 0.2 continues to be available via the [`legacy/v0.2`](https://github.com/columnflow/columnflow/tree/legacy/v0.2) branch, with the latest release being [v0.2.5](https://github.com/columnflow/columnflow/releases/tag/v0.2.5).
+
+## 🚧 Note on current development
+
+This project is in an advanced beta phase.
The project setup, suggested workflows, definitions of particular tasks, and the signatures of various helper classes and functions are mostly frozen but could still be subject to changes in the near future.
-At this point (July 2024), various large-scale analyses based upon columnflow are being developed, and in the process, help test and verify various aspects of its core.
-The first major release with a largely frozen API is expected in the fall of 2024.
-However, if you would like to join early on, contribute or just give it a spin, feel free to get in touch!
+Various large-scale analyses based upon columnflow have been performed, others are being developed, and in the process, help test and verify various aspects of the framework.

<!-- marker-after-note -->

<!-- marker-before-analytics -->

-![Columnflow analytics](https://repobeats.axiom.co/api/embed/b6ebc5ba41019de55eb48e195eecb438890442c8.svg "Columnflow analytics")
+<div align="center">
+<img alt="Columnflow analytics" src="https://repobeats.axiom.co/api/embed/b6ebc5ba41019de55eb48e195eecb438890442c8.svg" />
+</div>

<!-- marker-after-analytics -->

<!-- marker-before-body -->

## Quickstart
## Quickstart

To create an analysis using columnflow, it is recommended to start from a predefined template (located in [analysis_templates](https://github.com/columnflow/columnflow/tree/master/analysis_templates)).
The following command (no previous git clone required) interactively asks for a handful of names and settings, and creates a minimal, yet fully functioning project structure for you!
@@ -103,23 +117,26 @@ Setup successfull! The next steps are:

For a better overview of the tasks that are triggered by the commands below, checkout the current (yet stylized) [task graph](https://github.com/columnflow/columnflow/wiki#default-task-graph).

-## Projects using columnflow
+## 💯 Projects using columnflow

- [hh2bbtautau](https://github.com/uhh-cms/hh2bbtautau): HH → bb𝜏𝜏 analysis with CMS.
- [hh2bbww](https://github.com/uhh-cms/hh2bbww): HH → bbWW analysis with CMS.
- [topmass](https://github.com/uhh-cms/topmass): Top quark mass measurement with CMS.
- [mttbar](https://github.com/uhh-cms/mttbar): Search for heavy resonances in ttbar events with CMS.
-- [analysis playground](https://github.com/uhh-cms/analysis_playground): A testing playground for HEP analyses.
+- [analysis playground](https://github.com/uhh-cms/AZH2inv): TODO
+- [topsf](https://github.com/uhh-cms/topsf): Top tagging scale factor measurement.
+- [hto4l](https://github.com/uhh-cms/hto4l): H → ZZ → 4l analysis with CMS.
+- [DiJetJERC](https://github.com/uhh-cms/DiJetJERC): Di-jet analysis with CMS.

-## Contributors
+## 🙏 Contributors

<!-- ALL-CONTRIBUTORS-LIST:START - Do not remove or modify this section -->
<!-- prettier-ignore-start -->
<!-- markdownlint-disable -->
<table>
<tbody>
<tr>
-<td align="center" valign="top" width="14.28%"><a href="https://github.com/riga"><img src="https://avatars.githubusercontent.com/u/1908734?v=4?s=100" width="100px;" alt="Marcel Rieger"/><br /><sub><b>Marcel Rieger</b></sub></a><br /><a href="https://github.com/columnflow/columnflow/commits?author=riga" title="Code">💻</a> <a href="https://github.com/columnflow/columnflow/pulls?q=is%3Apr+reviewed-by%3Ariga" title="Reviewed Pull Requests">👀</a> <a href="https://github.com/columnflow/columnflow/commits?author=riga" title="Documentation">📖</a> <a href="https://github.com/columnflow/columnflow/commits?author=riga" title="Tests">⚠️</a></td>
+<td align="center" valign="top" width="14.28%"><a href="https://github.com/riga"><img src="https://avatars.githubusercontent.com/u/1908734?v=4?s=100" width="100px;" alt="Marcel Rieger"/><br /><sub><b>Marcel Rieger</b></sub></a><br /><a href="https://github.com/columnflow/columnflow/commits?author=riga" title="Code">💻</a> <a href="https://github.com/columnflow/columnflow/pulls?q=is%3Apr+reviewed-by%3Ariga" title="Reviewed Pull Requests">👀</a></td>
<td align="center" valign="top" width="14.28%"><a href="https://github.com/mafrahm"><img src="https://avatars.githubusercontent.com/u/49306645?v=4?s=100" width="100px;" alt="Mathis Frahm"/><br /><sub><b>Mathis Frahm</b></sub></a><br /><a href="https://github.com/columnflow/columnflow/commits?author=mafrahm" title="Code">💻</a> <a href="https://github.com/columnflow/columnflow/pulls?q=is%3Apr+reviewed-by%3Amafrahm" title="Reviewed Pull Requests">👀</a></td>
<td align="center" valign="top" width="14.28%"><a href="https://github.com/dsavoiu"><img src="https://avatars.githubusercontent.com/u/17005255?v=4?s=100" width="100px;" alt="Daniel Savoiu"/><br /><sub><b>Daniel Savoiu</b></sub></a><br /><a href="https://github.com/columnflow/columnflow/commits?author=dsavoiu" title="Code">💻</a> <a href="https://github.com/columnflow/columnflow/pulls?q=is%3Apr+reviewed-by%3Adsavoiu" title="Reviewed Pull Requests">👀</a></td>
<td align="center" valign="top" width="14.28%"><a href="https://github.com/pkausw"><img src="https://avatars.githubusercontent.com/u/26219567?v=4?s=100" width="100px;" alt="pkausw"/><br /><sub><b>pkausw</b></sub></a><br /><a href="https://github.com/columnflow/columnflow/commits?author=pkausw" title="Code">💻</a> <a href="https://github.com/columnflow/columnflow/pulls?q=is%3Apr+reviewed-by%3Apkausw" title="Reviewed Pull Requests">👀</a></td>
@@ -136,6 +153,10 @@ For a better overview of the tasks that are triggered by the commands below, che
<td align="center" valign="top" width="14.28%"><a href="https://github.com/jomatthi"><img src="https://avatars.githubusercontent.com/u/82223346?v=4?s=100" width="100px;" alt="jomatthi"/><br /><sub><b>jomatthi</b></sub></a><br /><a href="https://github.com/columnflow/columnflow/commits?author=jomatthi" title="Code">💻</a></td>
<td align="center" valign="top" width="14.28%"><a href="https://github.com/JulesVandenbroeck"><img src="https://avatars.githubusercontent.com/u/93740577?v=4?s=100" width="100px;" alt="JulesVandenbroeck"/><br /><sub><b>JulesVandenbroeck</b></sub></a><br /><a href="https://github.com/columnflow/columnflow/commits?author=JulesVandenbroeck" title="Code">💻</a></td>
</tr>
<tr>
<td align="center" valign="top" width="14.28%"><a href="https://github.com/aalvesan"><img src="https://avatars.githubusercontent.com/u/99343616?v=4?s=100" width="100px;" alt="Ana Andrade"/><br /><sub><b>Ana Andrade</b></sub></a><br /><a href="https://github.com/columnflow/columnflow/commits?author=aalvesan" title="Code">💻</a></td>
<td align="center" valign="top" width="14.28%"><a href="https://github.com/philippgadow"><img src="https://avatars.githubusercontent.com/u/6804366?v=4?s=100" width="100px;" alt="philippgadow"/><br /><sub><b>philippgadow</b></sub></a><br /><a href="https://github.com/columnflow/columnflow/commits?author=philippgadow" title="Code">💻</a></td>
</tr>
</tbody>
</table>

@@ -14,16 +14,8 @@


@calibrator(
-uses={
-deterministic_seeds,
-"Jet.pt", "Jet.mass",
-},
-produces={
-deterministic_seeds,
-"Jet.pt", "Jet.mass",
-"Jet.pt_jec_up", "Jet.mass_jec_up",
-"Jet.pt_jec_down", "Jet.mass_jec_down",
-},
+uses={deterministic_seeds, "Jet.{pt,eta,phi,mass}"},
+produces={deterministic_seeds, "Jet.{pt,mass}{,_jec_up,_jec_down}"},
)
def example(self: Calibrator, events: ak.Array, **kwargs) -> ak.Array:
# a) "correct" Jet.pt by scaling four momenta by 1.1 (pt<30) or 0.9 (pt>=30)
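
Note on the brace syntax in the new `uses`/`produces` sets: the patterns appear to be shorthand for the explicit column lists they replace, with `uses` additionally picking up `Jet.eta` and `Jet.phi`. The sketch below is a minimal, standalone illustration of shell-style brace expansion. The helper `brace_expand` is hypothetical and written only for this note. It is not columnflow's own implementation and handles only non-nested groups.

```python
import itertools
import re


def brace_expand(pattern: str) -> list[str]:
    """Expand every non-nested {a,b,c} group in pattern into all combinations.

    Hypothetical helper for illustration only, not part of columnflow.
    """
    # split into literal parts and brace groups,
    # e.g. "Jet.{pt,mass}" -> ["Jet.", "{pt,mass}", ""]
    parts = re.split(r"(\{[^{}]*\})", pattern)
    # a brace group contributes its comma-separated alternatives,
    # a literal part contributes exactly one alternative (itself)
    choices = [
        part[1:-1].split(",") if part.startswith("{") and part.endswith("}") else [part]
        for part in parts
    ]
    return ["".join(combo) for combo in itertools.product(*choices)]


used = brace_expand("Jet.{pt,eta,phi,mass}")
produced = brace_expand("Jet.{pt,mass}{,_jec_up,_jec_down}")
print(used)
print(produced)
```

Under these assumptions, the second pattern yields the same six `Jet.pt*` and `Jet.mass*` columns that the removed `produces` block listed one by one.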
@@ -50,6 +50,9 @@
# named function hooks that can modify store_parts of task outputs if needed
ana.x.store_parts_modifiers = {}

# histogramming hooks, invoked before creating plots when --hist-hook parameter set
ana.x.hist_hooks = {}


#
# setup configs
@@ -95,7 +98,7 @@
# backgrounds
"tt_sl_powheg",
# signals
-"st_tchannel_t_powheg",
+"st_tchannel_t_4f_powheg",
]
for dataset_name in dataset_names:
# add the dataset
@@ -108,11 +111,13 @@
# verify that the root process of all datasets is part of any of the registered processes
verify_config_processes(cfg, warn=True)

-# default objects, such as calibrator, selector, producer, ml model, inference model, etc
+# default objects, such as calibrator, selector, reducer, producer, ml model, inference model, etc
cfg.x.default_calibrator = "example"
cfg.x.default_selector = "example"
cfg.x.default_selector_steps = []
cfg.x.default_reducer = "cf_default"
cfg.x.default_producer = "example"
-cfg.x.default_weight_producer = "example"
+cfg.x.default_hist_producer = "cf_default"
cfg.x.default_ml_model = None
cfg.x.default_inference_model = "example"
cfg.x.default_categories = ("incl",)
@@ -165,15 +170,15 @@

# calibrator groups for conveniently looping over certain calibrators
# (used during calibration)
-cfg.x.calibrator_groups = {}
+ana.x.calibrator_groups = {}

# producer groups for conveniently looping over certain producers
# (used during the ProduceColumns task)
-cfg.x.producer_groups = {}
+ana.x.producer_groups = {}

# ml_model groups for conveniently looping over certain ml_models
# (used during the machine learning tasks)
-cfg.x.ml_model_groups = {}
+ana.x.ml_model_groups = {}

# custom method and sandbox for determining dataset lfns
cfg.x.get_dataset_lfns = None
@@ -230,7 +235,7 @@
add_shift_aliases(cfg, "mu", {"muon_weight": "muon_weight_{direction}"})

# external files
-json_mirror = "/afs/cern.ch/work/m/mrieger/public/mirrors/jsonpog-integration-9ea86c4c"
+json_mirror = "/afs/cern.ch/work/m/mrieger/public/mirrors/jsonpog-integration-377439e8"
cfg.x.external_files = DotDict.wrap({
# lumi files
"lumi": {
@@ -0,0 +1,43 @@
# coding: utf-8

"""
Example histogram producer.
"""

from columnflow.histogramming import HistProducer
from columnflow.histogramming.default import cf_default
from columnflow.util import maybe_import
from columnflow.config_util import get_shifts_from_sources
from columnflow.columnar_util import Route

ak = maybe_import("awkward")
np = maybe_import("numpy")


# extend columnflow's default hist producer
@cf_default.hist_producer()
def example(self: HistProducer, events: ak.Array, **kwargs) -> tuple[ak.Array, ak.Array]:
# build the full event weight
weight = ak.Array(np.ones(len(events), dtype=np.float32))

if self.dataset_inst.is_mc and len(events):
for column in self.weight_columns:
weight = weight * Route(column).apply(events)

return events, weight


@example.init
def example_init(self: HistProducer) -> None:
self.weight_columns = set()

if self.dataset_inst.is_data:
return

# store column names referring to weights to multiply
self.weight_columns |= {"normalization_weight", "muon_weight"}
self.uses |= self.weight_columns

# declare shifts that the produced event weight depends on
shift_sources = {"mu"}
self.shifts |= set(get_shifts_from_sources(self.config_inst, *shift_sources))
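
Note on the weight logic in this new file: the event weight starts at one and, for simulated datasets only, is multiplied by each configured weight column. The sketch below mirrors that logic with plain Python lists instead of awkward arrays so that it runs standalone. The function `build_event_weight` and the toy values are assumptions made for illustration and are not part of columnflow.

```python
def build_event_weight(
    columns: dict[str, list[float]],
    weight_columns: set[str],
    n_events: int,
    is_mc: bool,
) -> list[float]:
    """Multiply all requested weight columns event by event (hypothetical helper)."""
    # start from unit weights, one entry per event
    weight = [1.0] * n_events
    # only simulation carries weights; data keeps the unit weight
    if is_mc and n_events:
        for name in sorted(weight_columns):
            weight = [w * c for w, c in zip(weight, columns[name])]
    return weight


# toy event content with made-up numbers
columns = {
    "normalization_weight": [0.5, 2.0, 1.0],
    "muon_weight": [1.0, 0.9, 1.1],
}

# note: the column names live in a set; a bare {} would be an empty dict,
# and dict |= set raises an error
mc_columns = set()
mc_columns |= {"normalization_weight", "muon_weight"}

mc_weight = build_event_weight(columns, mc_columns, n_events=3, is_mc=True)
data_weight = build_event_weight(columns, set(), n_events=3, is_mc=False)
print(mc_weight, data_weight)
```

The data case returns unit weights regardless of which columns are present, which corresponds to the early return for `is_data` in the init function.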
@@ -38,7 +38,7 @@ def example(self):
"ST",
is_signal=True,
config_process="st",
-config_mc_datasets=["st_tchannel_t_powheg"],
+config_mc_datasets=["st_tchannel_t_4f_powheg"],
)
self.add_process(
"TT",
@@ -40,7 +40,7 @@ def sandbox(self, task: law.Task) -> str:

def datasets(self, config_inst: od.Config) -> set[od.Dataset]:
return {
-config_inst.get_dataset("st_tchannel_t_powheg"),
+config_inst.get_dataset("st_tchannel_t_4f_powheg"),
config_inst.get_dataset("tt_sl_powheg"),
}

@@ -34,7 +34,7 @@ def my_plot1d_func(
variable_settings: dict | None = None,
example_param: str | float | bool | None = None,
**kwargs,
-) -> tuple(plt.Figure, tuple(plt.Axis,)):
+) -> tuple[plt.Figure, tuple[plt.Axis]]:
"""
This is an exemplary custom plotting function.

@@ -47,10 +47,10 @@ def my_plot1d_func(
"""
# we can add arbitrary parameters via the `general_settings` parameter to access them in the
# plotting function. They are automatically parsed either to a bool, float, or string
-print(f"The example_param has been set to '{example_param}' (type: {type(example_param)})")
+print(f"the example_param has been set to '{example_param}' (type: {type(example_param)})")

# call helper function to remove shift axis from histogram
-remove_residual_axis(hists, "shift")
+hists = remove_residual_axis(hists, "shift")

# call helper functions to apply the variable_settings and process_settings
variable_inst = variable_insts[0]
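
Note on the `hists = remove_residual_axis(hists, "shift")` change: the diff does not show the helper's body, but the fix suggests that its result is returned and not applied in place, so a call without assignment would have no effect. The sketch below illustrates that pattern with a hypothetical stand-in, `drop_axis`, operating on plain dictionaries. It makes no claim about how the real helper works internally.

```python
def drop_axis(hists: dict[str, dict[str, float]], axis: str) -> dict[str, dict[str, float]]:
    """Return a new mapping without the given axis; the input is left untouched.

    Hypothetical stand-in for illustration only.
    """
    return {
        name: {key: value for key, value in hist.items() if key != axis}
        for name, hist in hists.items()
    }


original = {"tt": {"shift": 0.0, "pt": 42.0}}

# result discarded: `original` still contains the "shift" entry
drop_axis(original, "shift")

# result kept: the returned mapping no longer contains "shift"
cleaned = drop_axis(original, "shift")
print(original, cleaned)
```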