Releases: gprocunier/calabi
v1.3.1
Calabi v1.3.1 (2026-04-28)
Maintenance release focused on the Cockpit observer zram visualization and the
zram writeback policy fix that enables the backing store to function correctly.
What Changed
zram Writeback Policy Fix
- Changed the default writeback policy mode from `incompressible` to `huge`.
  The `incompressible` keyword was silently accepted by the kernel but never
  triggered actual writeback, leaving the backing store unused. The `huge`
  mode is a valid kernel writeback type that correctly identifies and offloads
  pages that did not compress well.
- Updated the writeback policy template, role defaults, role validation,
  vars, override examples, and all documentation references.
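For context, the sysfs interaction behind the fix looks roughly like this. This is a minimal sketch; the real logic lives in the deployed `zram-writeback-policy.sh`, and the function name and `sysfs_root` parameter are illustrative (the parameter exists only so the sketch can run against a scratch directory):

```python
from pathlib import Path

def set_writeback_mode(mode: str, sysfs_root: str = "/sys/block/zram0") -> None:
    """Trigger zram writeback for a page class.

    Writing a keyword such as 'huge' or 'idle' to the `writeback` attribute
    asks the kernel to offload matching pages to the configured backing
    device. On a real host this requires root and a configured backing_dev.
    """
    Path(sysfs_root, "writeback").write_text(mode + "\n")
```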
Cockpit Observer: Two-Tier zram Visualization
- Replaced the single "Swap Occupancy" gauge with a two-tier "Swap Capacity"
  card showing separate RAM-tier and backing-tier occupancy meters when a
  writeback backing store is active.
- Split the three-segment utilization bar into four segments when writeback is
  active: RAM cost, compression avoided, writeback offloaded, and free logical
  capacity. Falls back to the original three-segment view when no backing
  store is configured.
- Split the "RAM Avoided" card subtext and the memory overview "Reclaim
  Contribution" card into compression savings vs. writeback offload.
- Adjusted memory pressure scoring to reduce the zram-utilization penalty when
  a backing store is active with headroom.
- Added `backing_dev_size_bytes` to the collector (auto-detected from sysfs)
  and emitted it as `calabi_zram_backing_dev_size_bytes` in the Prometheus
  exporter.
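The auto-detection described above can be sketched as follows. The metric name comes from this release; the helper names and example device are illustrative (`backing_dev` and a block device's `size` attribute are standard kernel sysfs interfaces, with `size` in 512-byte sectors), and both root paths are parameters only so the sketch can run against a scratch tree:

```python
from pathlib import Path

def backing_dev_size_bytes(zram_sysfs: str = "/sys/block/zram0",
                           block_sysfs: str = "/sys/class/block") -> int:
    """Auto-detect the zram backing device size from sysfs.

    `backing_dev` holds the backing device path, or 'none' when no writeback
    backing store is configured; `size` is reported in 512-byte sectors.
    """
    backing = Path(zram_sysfs, "backing_dev").read_text().strip()
    if backing in ("", "none"):
        return 0  # no backing store: observer falls back to single-tier views
    sectors = int(Path(block_sysfs, Path(backing).name, "size").read_text())
    return sectors * 512

def render_metric(size_bytes: int) -> str:
    """Emit the gauge in Prometheus text exposition format."""
    return ("# TYPE calabi_zram_backing_dev_size_bytes gauge\n"
            f"calabi_zram_backing_dev_size_bytes {size_bytes}\n")
```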
Cockpit Observer: Dark Mode Contrast
- Added a theme-aware dual color palette (`COLORS_LIGHT` / `COLORS_DARK`)
  with a `matchMedia` listener that switches automatically.
- Updated gauge rendering to pass theme-aware track, text, and label colors
  instead of hardcoded values.
- Added dark mode CSS overrides for verdict badges, error states, and legend
  dots.
- Updated sparkline.js gauge defaults for better visibility in both modes.
Documentation
- Added a "Two-Tier Swap Model" section to `host-memory-oversubscription.md`
  explaining how the RAM tier and backing tier interact, with a worked example
  from a running lab.
- Updated the Cockpit observer `INTERPRETING.md` with four-segment capacity
  bar and two-tier swap capacity card documentation.
- Added `bd_stat` writeback validation to the operational validation checklist.
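The `bd_stat` check can be sketched as a small parser, assuming the three-column `bd_count bd_reads bd_writes` format documented for the kernel zram driver (the function name is illustrative):

```python
def parse_bd_stat(text: str) -> dict:
    """Parse /sys/block/zramN/bd_stat into named counters.

    The file holds three space-padded columns: bd_count (data currently held
    on the backing device), bd_reads, and bd_writes. A nonzero bd_writes
    after the policy has run is the simplest signal that writeback is really
    happening; under the broken `incompressible` mode it stayed at zero.
    """
    bd_count, bd_reads, bd_writes = (int(v) for v in text.split())
    return {
        "bd_count": bd_count,
        "bd_reads": bd_reads,
        "bd_writes": bd_writes,
        # counters are reported in 4 KiB units per the kernel zram docs
        "backing_bytes": bd_count * 4096,
    }
```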
Release Status
- Release tag: `v1.3.1`
- Supersedes: `v1.3.0`
- Cockpit observer version: `1.3.0`
- AWS remains the primary validated deployment path
- The on-prem path remains experimental with cluster-capable profiles
Key Entry Points
- Host memory policy: `aws-metal-openshift-demo/docs/host-memory-oversubscription.md`
- Writeback policy role: `aws-metal-openshift-demo/roles/lab_host_memory_oversubscription`
- Writeback policy template: `aws-metal-openshift-demo/roles/lab_host_memory_oversubscription/templates/zram-writeback-policy.sh.j2`
- Observer plugin: `cockpit/calabi-observer/`
- Observer interpretation guide: `cockpit/calabi-observer/INTERPRETING.md`
- Prometheus exporter: `cockpit/calabi-observer/calabi_exporter.py`
- On-prem override examples: `on-prem-openshift-demo/inventory/overrides/`
Notes
- The writeback policy fix is the most operationally significant change in this
  release. Hosts running `v1.3.0` with a configured backing device were not
  writing back to it because the kernel silently accepted `incompressible` as a
  writeback mode without performing any writeback. Upgrading to `v1.3.1` and
  re-applying the memory policy (or manually editing the deployed script) is
  sufficient to activate the backing store.
- The Cockpit observer gracefully degrades when no backing store is configured:
  the utilization bar, swap capacity card, and reclaim contribution all fall
  back to the pre-writeback single-tier view.
- All backing store detection is auto-derived from sysfs at runtime. No
  hardcoded device paths, sizes, or NVMe assumptions exist in the observer
  code.
Calabi v1.3.0
Calabi v1.3.0 (2026-04-26)
This feature release moves the experimental on-prem path from a
support-services-only proof point toward a cluster-capable workflow, while also
adding external ODF consumption and stronger rerun/convergence behavior across
the shared AWS/on-prem automation.
What Changed
External ODF Consumption
- Added `openshift_post_install_odf_mode: external` as a first-class day-2
  storage mode.
- Added `openshift_post_install_odf_external`, which:
  - installs the ODF operator
  - imports external Ceph cluster details from a file or base64 payload
  - creates the external cluster-details secret
  - applies the external `StorageCluster`
  - accepts imported-cluster health states such as `Connected`
- Kept external ODF in the same day-2 ordering slot as internal ODF so
  Keycloak, AAP, NetObserv, and later storage consumers do not gain new
  dependency edges.
- Added optional OVN host-routing convergence for external Ceph endpoints:
  `routingViaHost: true` and `ipForwarding: Global`.
- Added effective ODF storage class variables so downstream services do not
  hard-code internal ODF storage class names.
On-Prem Cluster-Capable Profiles
- Expanded `on-prem-openshift-demo` with override-driven deployment profiles
  for support services, AD, 128 GiB host sizing, and a 3-control-plane /
  3-worker external-Ceph cluster.
- Added extra OVS bridge and libvirt network support for routed storage
  networks and jumbo-frame external Ceph access.
- Added on-prem host sizing assessment, full cleanup automation, LVM guest
  storage helpers, and deterministic `/dev/ebs/*` compatibility rules.
- Added the on-prem override reference:
  `on-prem-openshift-demo/docs/override-mechanism.md`.
- Documented the difference between `enable_*` phase toggles and `force_*`
  rerun toggles.
Rerun And Convergence Hardening
- Added support-service convergence probes so reruns can skip already-healthy
  AD, IdM, bastion, bastion-join, and mirror-registry phases.
- Added OpenShift cluster convergence checks so a healthy existing cluster does
  not get rebuilt by accident.
- Added day-2 phase health derivation and recursive default normalization for
  post-install roles so partial overrides do not drop nested defaults.
- Made destructive ODF recovery and other repair paths explicit through force
  flags instead of normal rerun behavior.
- Improved cluster cleanup paths, including safer handling of missing disk
  definitions and OpenShift-only teardown boundaries.
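The "recursive default normalization" behavior can be illustrated with a minimal merge sketch; the actual logic lives in the post-install roles, and this only shows the intended semantics:

```python
def normalize_defaults(defaults: dict, overrides: dict) -> dict:
    """Recursively merge operator overrides onto role defaults.

    A partial override of a nested mapping keeps the sibling default keys
    the operator did not mention, instead of replacing the whole subtree.
    Inputs are left unmodified.
    """
    merged = dict(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = normalize_defaults(merged[key], value)
        else:
            merged[key] = value
    return merged
```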
Mirror Registry, Tooling, And Guest Sizing
- Added a reusable `calabi_shell` role and installed `calabi-shell`
  system-wide after RHSM/package access is available.
- Added `oc-mirror` parallelism tunables:
  - `mirror_registry_oc_mirror_parallel_images`
  - `mirror_registry_oc_mirror_parallel_layers`
- Kept the default mirror workflow portable: mirror-to-disk followed by
  disk-to-mirror.
- Updated support guest sizing for the current lab profile:
- bastion: 8 GiB
- mirror-registry: 8 GiB, 4 vCPU
- IdM: 8 GiB, 4 vCPU
- Added host zram writeback policy support and documented how KSM, zram, and
Kubernetes requested-memory scheduling interact.
Documentation And Docs Site
- Refreshed AWS and on-prem automation-flow and manual-process docs to match
  the current orchestration order and handoff model.
- Added detailed external ODF, override, day-2 continuation, teardown, mirror,
  resource-sizing, and NetObserv/AAP scheduling guidance.
- Updated the docs site renderer with Shiki-backed code highlighting and
  package metadata for the Node dependency.
- Sanitized the publish/export path so local external-Ceph payloads,
  generated files, RPM/SRPM build artifacts, temporary files, and local logs do
  not survive into the release tree.
Cockpit Observer
- Refreshed the Cockpit observer UI and supporting assets.
- Added Prometheus/exporter-oriented service files and helper components.
- Removed old generated RPM/SRPM artifacts from the published tree; RPMs
should be produced from source when needed rather than committed as build
outputs.
Release Status
- Release tag: `v1.3.0`
- Supersedes: `v1.2.1`
- Current release for the expanded on-prem/external-ODF automation and refreshed
  documentation set
- AWS remains the primary validated deployment path
- The on-prem path remains experimental, but now includes cluster-capable
  profiles and clearer operator runbooks
Key Entry Points
- Main lab entry: `aws-metal-openshift-demo/README.md`
- Main docs map: `aws-metal-openshift-demo/docs/README.md`
- On-prem docs map: `on-prem-openshift-demo/docs/README.md`
- On-prem override reference: `on-prem-openshift-demo/docs/override-mechanism.md`
- Aggregated AWS flow: `aws-metal-openshift-demo/playbooks/site-lab.yml`
- On-prem remote bastion wrapper: `on-prem-openshift-demo/scripts/run_remote_bastion_playbook.sh`
- External ODF role: `aws-metal-openshift-demo/roles/openshift_post_install_odf_external`
Notes
- `v1.3.0` is a large release because it captures accumulated work since
  `v1.2.1`, including automation, documentation, Cockpit observer, and publish
  sanitization updates.
- The external-Ceph example override in the published tree contains an empty
  cluster-details placeholder. Operators must provide their own external Ceph
  details before using that profile.
- The release tree intentionally excludes local generated artifacts, RPM/SRPM
  outputs, `node_modules`, secret material, and operator-local logs.
Calabi v1.2.1
Calabi v1.2.1 (2026-04-18)
This maintenance release hardens bootstrap sequencing after v1.2.0, with the
most important fixes centered on authoritative DNS bring-up and the new
on-prem staged runner flow.
What Changed
DNS Bootstrap Reliability
- Taught the hypervisor uplink bootstrap to prefer authoritative IdM DNS as
  soon as it is reachable, instead of hard-coding a single transition target.
- Taught bastion host-side provisioning and bastion guest bootstrap to derive
  DNS servers from the current bootstrap stage, preserving fallback behavior
  until IdM DNS is actually available.
- Reduced early bootstrap failures where bastion or hypervisor work could race
  ahead of authoritative DNS availability.
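The stage-derived resolver selection can be sketched as follows; all names here are hypothetical, and the real logic lives inside the bootstrap roles:

```python
def dns_servers_for_stage(idm_reachable: bool,
                          idm_dns: str,
                          fallback_dns: list[str]) -> list[str]:
    """Pick resolvers for the current bootstrap stage.

    Until authoritative IdM DNS answers, keep only the fallback resolvers so
    early bootstrap work cannot race ahead of it; once IdM is reachable,
    prefer it while retaining the fallbacks.
    """
    if idm_reachable:
        return [idm_dns] + fallback_dns
    return list(fallback_dns)
```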
Bastion And Guest Staging Hardening
- Added more robust bastion guest RHSM registration handling for both
  activation-key and username/password paths.
- Improved host-side bastion disk reseed handling by resolving the real block
  device path and tolerating benign partition reread behavior.
- Tightened supporting guest and host bootstrap tasks around staged package and
  service preparation.
On-Prem Runner Flow
- Fixed on-prem bastion-stage inventory/runtime issues that could break
  wrapper-driven `site-bootstrap` runs.
- Added on-prem runner scripts for tracked workstation, bastion, and remote
  bastion execution:
  - `on-prem-openshift-demo/scripts/run_local_playbook.sh`
  - `on-prem-openshift-demo/scripts/run_bastion_playbook.sh`
  - `on-prem-openshift-demo/scripts/run_remote_bastion_playbook.sh`
- Added `on-prem-openshift-demo/scripts/lab-dashboard.sh` support for tracked
  runner state and operator visibility.
Docs And Publish Tree
- Updated AWS and on-prem docs to use tracked runner wrappers as the operator
  entrypoints for the automation flow.
- Published sanitized inventory defaults in the release tree so GitHub-facing
  content does not carry local operator addresses or lab credentials.
- Kept the validated AWS deployment path intact while documenting the on-prem
  path as the alternate staged target.
Release Status
- Release tag: `v1.2.1`
- Current validated release for the sanitized publish tree
- Includes bootstrap DNS reliability fixes and on-prem staged-flow hardening
Key Entry Points
- Main lab entry: `aws-metal-openshift-demo/README.md`
- Main docs map: `aws-metal-openshift-demo/docs/README.md`
- On-prem docs map: `on-prem-openshift-demo/docs/README.md`
- AWS bootstrap wrapper: `aws-metal-openshift-demo/scripts/run_local_playbook.sh`
- On-prem bootstrap wrapper: `on-prem-openshift-demo/scripts/run_local_playbook.sh`
Notes
- `v1.2.1` supersedes `v1.2.0` as the current release tag.
- The GitHub Pages docs workflow remains functional on this release line.
Calabi v1.2.0
Calabi v1.2.0 (2026-04-09)
This release adds an experimental on-prem deployment mode while keeping the
validated AWS-target path intact.
What Changed
Experimental On-Prem Deployment Mode
- Added a new on-prem subtree: `on-prem-openshift-demo/`
- Added on-prem entrypoints for:
  - `playbooks/site-bootstrap.yml`
  - `playbooks/site-lab.yml`
- Added an on-prem host bootstrap path that assumes:
  - a preinstalled RHEL hypervisor
  - an operator-provided LVM volume group for guest storage
- Added on-prem guest disk provisioning that:
  - validates volume-group existence
  - validates free space before `lvcreate`
  - creates the expected guest logical volumes
  - publishes `/dev/ebs/*` compatibility symlinks
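The free-space validation before `lvcreate` can be sketched roughly like this, assuming output from `vgs --noheadings --units b -o vg_name,vg_free` (helper names are illustrative; the real checks live in the provisioning role):

```python
def parse_vgs_free(output: str) -> dict:
    """Parse `vgs --noheadings --units b -o vg_name,vg_free` output into
    {vg_name: free_bytes}. In byte units, vgs suffixes values with 'B'."""
    free = {}
    for line in output.splitlines():
        if not line.strip():
            continue
        name, size = line.split()
        free[name] = int(size.rstrip("B"))
    return free

def can_provision(free_bytes: int, guest_lv_sizes: list) -> bool:
    """Fail fast before any lvcreate if the VG cannot hold every guest LV."""
    return sum(guest_lv_sizes) <= free_bytes
```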
AWS-Safe Isolation
- Reworked the on-prem implementation so the validated AWS codepath stays
  pristine.
- Kept all on-prem-specific behavior in local wrappers and on-prem-local
  playbooks instead of modifying `aws-metal-openshift-demo/`.
- Added an explicit on-prem bastion-to-hypervisor handoff model with:
  - `on_prem_bastion_hypervisor_host`
  - `on_prem_bastion_hypervisor_user`
- Removed the runtime requirement for `ec2-user` on the on-prem hypervisor.
On-Prem Docs
- Added an on-prem docs set under: `on-prem-openshift-demo/docs/`
- Covered the early steps that differ materially from AWS:
- prerequisites
- automation flow
- manual process
- host sizing and resource policy
- portability and gap analysis
- Marked the on-prem path as experimental in the source docs.
- Added explicit handoff points back to the main AWS docs once bastion staging
  is complete.
- Tightened the on-prem prose so it reads like operator guidance rather than an
  analysis memo.
GitHub Pages
- Added the on-prem docs to the rendered site as first-class Pages routes.
- Added an experimental on-prem entry from the main site flow while keeping the
  primary top-level navigation unchanged: "OPEN THE LAB", "DOCS MAP"
- Surfaced the on-prem path from:
- the repo root landing page
- the docs map
- Added a Pages-side experimental treatment for the on-prem subtree and kept
the docs handoff back to the main AWS docs clear.
Release Status
- Release tag: `v1.2.0`
- Current validated clean-deploy release
- The validated AWS-target deployment path remains the primary release path
- The on-prem mode is included as an experimental alternate target
Key Entry Points
- Main lab entry: `aws-metal-openshift-demo/README.md`
- Main docs map: `aws-metal-openshift-demo/docs/README.md`
- On-prem docs map: `on-prem-openshift-demo/docs/README.md`
- On-prem bootstrap: `on-prem-openshift-demo/playbooks/site-bootstrap.yml`
- On-prem lab entry: `on-prem-openshift-demo/playbooks/site-lab.yml`
Notes
- `v1.2.0` supersedes `v1.1.0` as the current release tag.
- The AWS-target path remains the validated baseline.
- The GitHub Pages workflow is functional, but the stock Actions dependencies
still emit a Node 20 deprecation advisory that should be cleaned up in a
future maintenance change.
Calabi v1.1.0
Calabi v1.1.0 (2026-04-09)
This release captures the merge of the `calabi-ad-services` feature branch into
`main`, plus the validation and documentation work needed to cut a clean
release from it.
What Changed
AD Services And Trust Flow
- Added the AD support-service path to the orchestration flow.
- Formalized the support-service order around:
- AD server
- IdM
- IdM/AD trust
- bastion join
- Codified the AD/IdM bridge data and the trust-side group mapping model that
feeds downstream auth consumers.
Authentication Model
- Kept OpenShift on the validated auth baseline of:
  - `HTPasswd` breakglass
  - Keycloak OIDC
  - group-based RBAC
- Replaced AAP direct LDAP auth with Keycloak OIDC as the clean-build path.
- Validated AD-backed user login to AAP on:
- the repaired in-place deployment path
- a clean AAP teardown and redeploy
Orchestration Hardening
- Hardened bastion-local generated workspace ownership handling for:
  - `generated/ocp`
  - `generated/tools`
- Fixed stale tool-path and helper-path assumptions in post-install validation
  and installer-binary publication.
- Added bounded recovery to day-2 roles where a single bad pod or daemonset
  member could strand a long deployment, including:
  - NMState
  - Web Terminal
  - AAP
  - virtualization handler rollout
- Fixed multiple fresh-deploy defects discovered during validation runs, such as:
  - missing mirror-registry Podman drop-in directory creation
  - install-wait assumptions about rendezvous metadata
  - post-install variable ordering and fact-default issues
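The bounded-recovery pattern described above, reduced to its skeleton (the real roles plug in OpenShift health checks and pod deletion as the repair step; the function and parameter names are illustrative):

```python
def bounded_recovery(check_healthy, repair, attempts: int = 3) -> bool:
    """Retry a repair action a bounded number of times.

    Rather than waiting indefinitely on one stuck pod or daemonset member,
    run the repair (e.g. delete the bad pod) and re-check, giving up after a
    fixed number of attempts so a long deployment fails fast instead of
    stranding.
    """
    for _ in range(attempts):
        if check_healthy():
            return True
        repair()
    return check_healthy()
```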
Documentation And Pages
- Refreshed the runbooks and architecture docs to match the current validated
  deployment shape.
- Reworked `manual-process.md` to reflect the real support-service order,
  trust checkpointing, clean-redeploy guidance, and the current auth baseline.
- Published the GitHub Pages site for the docs set.
- Tightened the Pages structure around the repo’s authored reading flow:
  - repo root README as the site entrypoint
  - the "OPEN THE LAB" and "DOCS MAP" navigation entries
- Fixed Mermaid rendering, linked inline repo paths back to source, and
cleaned up oversized or noisy rendered diagrams.
Release Status
- Release tag: `v1.1.0`
- Current validated clean-deploy release
- Clean deployment confirmed on the current codebase
Key Entry Points
- Lab entry point: `aws-metal-openshift-demo/README.md`
- Docs map: `aws-metal-openshift-demo/docs/README.md`
- Build/rebuild order: `aws-metal-openshift-demo/docs/automation-flow.md`
- Manual runbook: `aws-metal-openshift-demo/docs/manual-process.md`
- Auth model: `aws-metal-openshift-demo/docs/authentication-model.md`
Notes
- `v1.1.0` supersedes `v1.0.0` as the current validated release.
- The GitHub Pages workflow is functional, but the stock Actions dependencies
still emit a Node 20 deprecation advisory that should be cleaned up in a
future maintenance change.
Calabi v1.0.0
Calabi v1.0.0 (2026-04-05)
This is the first tagged release of Calabi.
Calabi is an Ansible-driven, single-host, fully disconnected OpenShift 4 lab
built on nested KVM. It is designed to let you demonstrate and iterate on
production-patterned installer and day-2 workflows while keeping the
infrastructure shape realistic.
What’s Included
- `aws-metal-openshift-demo/`: the main lab implementation (AWS scaffolding,
  hypervisor bootstrap, support guests, disconnected OpenShift install, day-2).
- `cockpit/calabi-observer/`: Cockpit plugin providing real-time observability
  for the host resource management system on `virt-01` (RPM spec included).
- Documentation map and deep-dive guides under `aws-metal-openshift-demo/docs/`.
Default Guest Sizing (Current)
- OpenShift `4.20.15`
- 3 masters: 8 vCPU / 24 GiB
- 3 infra: 16 vCPU / 64 GiB
- 3 workers: 12 vCPU / 16 GiB
These values come from `aws-metal-openshift-demo/vars/guests/openshift_cluster_vm.yml`
and are discussed in `aws-metal-openshift-demo/docs/host-resource-management.md`.
See `aws-metal-openshift-demo/docs/prerequisites.md` for controller and input
expectations (including ansible-core 2.18 and a RHEL 10.1 guest image source).
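Assuming the sizes above are per node, the aggregate nominal demand these defaults imply (which the KSM/zram host-resource-management layer is designed to absorb on a single host) can be tallied:

```python
# (count, vCPU, memory GiB) per node class, from the defaults above
GUEST_SIZING = {
    "masters": (3, 8, 24),
    "infra":   (3, 16, 64),
    "workers": (3, 12, 16),
}

def totals(sizing: dict) -> tuple:
    """Sum nominal vCPU and memory demand across all guests."""
    vcpu = sum(n * c for n, c, _ in sizing.values())
    mem_gib = sum(n * m for n, _, m in sizing.values())
    return vcpu, mem_gib
```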
Getting Started
- Entry point: `aws-metal-openshift-demo/README.md`
- Docs map: `aws-metal-openshift-demo/docs/README.md`
- Build/rebuild run order: `aws-metal-openshift-demo/docs/automation-flow.md`
Security And Secrets
Calabi intentionally references secret inputs by path and keeps live credential
material out of Git. Start with:
`aws-metal-openshift-demo/docs/secrets-and-sanitization.md`
Notes
- This is an inaugural release; there are no prior version tags to upgrade
  from. Future releases should add new entries to `CHANGELOG.md`.