bazel/macos - fix linker problems, swig and regression for macos by fredowski · Pull Request #9480 · The-OpenROAD-Project/OpenROAD

fredowski · 2026-02-16T20:28:26Z

I fixed some parts of the bazel build on macos to be able to build the openroad commandline version and to run the regression.

bazelisk run //:openroad

I start the regression with

bazelisk test //src/...

and have

Executed 1117 out of 1157 tests: 1096 tests pass and 61 fail locally.

Most failures are in gpl (33), rmp (8) with some (small) deviations in numbers. This is what claude says:

  Full Regression Results: 1096 passed, 61 failed out of 1157 tests

  Failure categories:

  ┌────────┬───────┬──────────────────────────────────────────────────────────────────────────┐
  │ Module │ Count │                                Root Cause                                │
  ├────────┼───────┼──────────────────────────────────────────────────────────────────────────┤
  │ gpl    │ 33    │ Placement coordinate diffs (macOS ARM floating point differences)        │
  ├────────┼───────┼──────────────────────────────────────────────────────────────────────────┤
  │ rmp    │ 8     │ Likely similar numerical diffs from restructuring/ABC                    │
  ├────────┼───────┼──────────────────────────────────────────────────────────────────────────┤
  │ rsz    │ 3     │ Placement coordinate diffs (e.g. 96726 vs 96673)                         │
  ├────────┼───────┼──────────────────────────────────────────────────────────────────────────┤
  │ psm    │ 2     │ Numerical diffs                                                          │
  ├────────┼───────┼──────────────────────────────────────────────────────────────────────────┤
  │ odb    │ 1     │ md5sum command not found on macOS (needs md5 instead)                    │
  ├────────┼───────┼──────────────────────────────────────────────────────────────────────────┤
  │ utl    │ 1     │ Metrics server socket shutdown error (Boost.Asio test environment issue) │
  └────────┴───────┴──────────────────────────────────────────────────────────────────────────┘

  Analysis:

  - None of the failures are caused by our changes (the SWIG fix or the Tcl runfiles fix)
  - ~58 failures are cross-platform numerical differences — placement coordinates differ slightly between macOS ARM and the Linux x86
  reference. These are expected when running on a different platform.
  - 1 failure (odb:write_def58_gzip) is a macOS compatibility issue — the test uses md5sum which doesn't exist on macOS (it's md5 instead).
  - 1 failure (utl:utl_cfile_unittest) is a Boost.Asio socket test issue in the metrics server test, unrelated to our changes.
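The `md5sum` failure listed above comes from shelling out to a tool that has a different name on macOS. As a rough illustration only (the real test is written in Tcl, and the helper names below are hypothetical), hashing or comparing the file contents in-process avoids the external command:

```python
import gzip
import hashlib


def md5_of_bytes(data: bytes) -> str:
    """Hash in-process instead of calling md5sum (Linux) or md5 (macOS)."""
    return hashlib.md5(data).hexdigest()


def gzip_matches_plain(plain_path: str, gz_path: str) -> bool:
    """Return True if the gzip file decompresses to the plain file's bytes."""
    with open(plain_path, "rb") as plain_file:
        plain = plain_file.read()
    with gzip.open(gz_path, "rb") as gz_file:
        unzipped = gz_file.read()
    return md5_of_bytes(plain) == md5_of_bytes(unzipped)
```

Comparing the two byte strings directly would work just as well; the hash is kept here only to mirror what the original test checked.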

I can also run the flow regression with

MacBookProM4-2:OpenROAD fritz$ OPENROAD_EXE=/Users/fritz/OpenROAD/bazel-bin/openroad test/regression flow
------------------------------------------------------
Flow
gcd_nangate45 (tcl) pass
gcd_sky130hd (tcl) *FAIL* RSZ::repair_design_buffer_count    2 >    1 ; DRT::clock_skew  0.04 >  0.04
gcd_sky130hs (tcl) pass
gcd_asap7 (tcl) pass
ibex_sky130hd (tcl) *FAIL* DRT::max_slew_slack -39% < -15% ; DRT::max_capacitance_slack -44% <   0%
ibex_sky130hs (tcl) *FAIL* DRT::max_slew_slack -18% <  -2% ; DRT::max_capacitance_slack -20% <  -5%
aes_nangate45 (tcl) *FAIL* RSZ::max_capacitance_slack -25% <   0% ; DRT::clock_skew  0.10 >  0.09 ; DRT::max_capacitance_slack -30% < -19%
aes_sky130hd (tcl) *FAIL* DRT::clock_skew  0.64 >  0.56
aes_sky130hs (tcl) *FAIL* RSZ::hold_buffer_count  465 >  128
aes_asap7 (tcl) *FAIL* DRT::worst_slack_max -206.71 <= -117.20 ; DRT::max_slew_slack -118% < -31%
tinyRocket_nangate45 (tcl) pass
jpeg_sky130hs (tcl) *FAIL* DRT::ANT::errors   2 >   0
jpeg_sky130hd (tcl) *FAIL* DRT::max_slew_slack -23% < -20% ; DRT::max_capacitance_slack -23% <  -1%
------------------------------------------------------
Failed 9/13

and see some parts failing. I run this on a MacbookPro M4 with macos 15.7.3 (24G419)

I made some changes for the tcl migration on the sta and the abc submodule. In the meantime abc has changed and it works right away with the new abc version. OpenSTA needs to choose the MachineApple.cc on MacOS. The gui build on MacOS does not work. So this is only commandline and regression.

The OpenSTA changes are here: https://github.com/fredowski/OpenSTA/tree/bazel-macos
abc changes for the OpenRoad abc version: https://github.com/fredowski/abc/tree/bazel-macos (not needed)

gemini-code-assist

Code Review

This pull request successfully migrates the Tcl dependency from rules_hdl to the Bazel Central Registry (BCR), a significant step in modernizing the build system. It also introduces several fixes to enable building on macOS, including necessary linker flags for Python extensions and compiler definitions. The changes are extensive but largely consist of mechanical updates to BUILD files, replacing the old Tcl dependency. Additionally, a test script has been updated for better cross-platform compatibility. My review includes a suggestion to further improve the robustness of this test script's file handling.

gemini-code-assist · 2026-02-16T20:30:09Z

+set f0 [open $out_def r]
+set def0 [read $f0]
+close $f0
+
+set f1 [open $unzipped_def r]
+set def1 [read $f1]
+close $f1


The change to compare file contents directly is great for cross-platform compatibility. However, the file handling can be made more robust. If an error occurs during the read command, the corresponding close command will not be executed, leading to a resource leak. It's better to use try...finally blocks to ensure files are always closed.

set f0 [open $out_def r]
try {
  set def0 [read $f0]
} finally {
  close $f0
}

set f1 [open $unzipped_def r]
try {
  set def1 [read $f1]
} finally {
  close $f1
}

maliberty · 2026-02-16T21:15:08Z

Thanks for contributing. Commits need to be signed to pass DCO (use git commit -s to sign). To fix the commits in this PR click the DCO link for info.

github-actions · 2026-02-16T21:16:18Z

clang-tidy review says "All clean, LGTM! 👍"

fredowski · 2026-02-16T22:34:00Z

> Thanks for contributing. Commits need to be signed to pass DCO (use git commit -s to sign). To fix the commits in this PR click the DCO link for info.

I pushed the signed versions. Thanks for working on OpenROAD!

hzeller · 2026-02-17T17:22:27Z

Thanks for contributing!

I think last time I checked, the BCR tcl version does not work fully as it fails to find its init.tcl and other files of its standard library:

$ bazel run @tcl_lang//:tclsh
...
initialization failed: Cannot find a usable init.tcl in the following directories: 
    ....

The BCR version needs to be changed to find these, and I am playing with some options how to do that (mostly involving resolving runfiles) to then upstream to BCR. Maybe we should await that fix (but I will not get to it before the weekend).

Maybe the issues you see in the failures can be explained with that ?
(also CC: @QuantamHD )

stefanottili · 2026-02-17T17:47:41Z

I was a little surprised about the regression differences.
Reading up on x86/arm floating point differences I found this:
https://learn.arm.com/learning-paths/cross-platform/floating-point-behavior.

> Understand that Arm and x86 produce identical results for all well-defined floating-point operations.
> Recognize that differences only occur in special undefined cases permitted by IEEE 754.

The lowest hanging fruit to make arm behave like x86 is:
> Disable fused multiply-add operations using the -ffp-contract=off compiler flag.

This also works the other way around:
> Use the compiler flag -ffp-contract=fast to enable fused multiply-add on x86.

> On Ubuntu 24.04 the GNU Compiler, gcc, produces the same result as x86
> and does not use the fmadd instruction.
> Be aware that corner case examples like this may change in future compiler versions.
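To make the fused multiply-add point concrete, here is a small sketch that emulates the two behaviours in Python. It assumes nothing about OpenROAD's code: the inputs are chosen by hand so the effect is visible, and the fused variant is emulated with exact rational arithmetic rather than a hardware FMA instruction.

```python
from fractions import Fraction


def multiply_add_unfused(a: float, b: float, c: float) -> float:
    """Round after the multiply and again after the add (two roundings)."""
    return a * b + c


def multiply_add_fused(a: float, b: float, c: float) -> float:
    """Emulate FMA: compute a*b+c exactly, then round once to a double."""
    exact = Fraction(a) * Fraction(b) + Fraction(c)
    return float(exact)


# Hand-picked inputs: a*b is exactly 1 - 2**-60, which rounds to 1.0.
a = 1.0 + 2.0**-30
b = 1.0 - 2.0**-30
c = -1.0

print(multiply_add_unfused(a, b, c))  # 0.0
print(multiply_add_fused(a, b, c))    # -2**-60, about -8.67e-19
```

Both results are valid under IEEE 754; which one a compiler produces depends on whether it contracts the expression, which is what -ffp-contract controls.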

hzeller · 2026-02-17T17:56:20Z

Maybe also some rounding or stdlib differences ?
I have #8317 open (it needs some updating, I guess), which showed slight differences because some calculations mixed float and double. Maybe related.
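As a generic illustration of how mixing float and double changes results (this is not the code from #8317, and the helper names are made up for the example), the same accumulation gives different sums depending on the precision of the intermediate values:

```python
import struct


def to_float32(x: float) -> float:
    """Round a Python float (a C double) to the nearest IEEE 754 single."""
    return struct.unpack("f", struct.pack("f", x))[0]


def sum_as_double(values):
    """Accumulate in double precision."""
    total = 0.0
    for v in values:
        total += v
    return total


def sum_as_float(values):
    """Accumulate in single precision by rounding every intermediate."""
    total = 0.0
    for v in values:
        total = to_float32(total + to_float32(v))
    return total


values = [0.1] * 10
print(sum_as_double(values))
print(sum_as_float(values))
```

The two sums agree to about seven significant digits and differ after that, which is enough to change a printed coordinate or a comparison further down the flow.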

github-actions · 2026-02-17T18:32:08Z

clang-tidy review says "All clean, LGTM! 👍"

fredowski · 2026-02-17T20:03:14Z

> Thanks for contributing!
>
> I think last time I checked, the BCR tcl version does not work fully as it fails to find its init.tcl and other files of its standard library:
> $ bazel run @tcl_lang//:tclsh
> ...
> initialization failed: Cannot find a usable init.tcl in the following directories:

Thanks for looking into this build. I had these problems with missing Tcl runtime files like init.tcl on MacOS, but I fixed them. The runtime files are addressed in

> Thanks for contributing!
>
> I think last time I checked, the BCR tcl version does not work fully as it fails to find its init.tcl and other files of its standard library:
>
> Maybe the issues you see in the failures can be explained with that ? (also CC: @QuantamHD )

Thanks for looking at my patch! I also had the missing runtime files like init.tcl in my build. The runtime files are provided by the changes in bazel/BUILD (the data section) and in bazel/InitRunFiles.cpp. I looked at some of the tests, and when something like init.tcl is missing, more or less all tests fail.

fredowski · 2026-02-17T20:37:41Z

> I was a little surprised about the regression differences. Reading up on x86/arm floating point differences I found this: https://learn.arm.com/learning-paths/cross-platform/floating-point-behavior.
> > Understand that Arm and x86 produce identical results for all well-defined floating-point operations.
> > Recognize that differences only occur in special undefined cases permitted by IEEE 754.
>
> The lowest hanging fruit to make arm behave like x86 is:
> > Disable fused multiply-add operations using the -ffp-contract=off compiler flag.

In parallel, I am working on building OpenROAD with bazel on debian 13 in an arm64 virtual machine on my Macbook. There the regression fails for two tests, "//src/psm/test:gcd_all_vss-tcl_test" and "//src/psm/test:gcd_em_test_vdd-tcl_test", out of 1155. So fewer tests are failing there and something must be different.

fredowski · 2026-02-18T08:42:12Z

I now used the bazel-macos branch to build on debian 13 aarch64 in a virtual machine on my macbook. Without the bazel-macos changes, two tests fail on linux. With the bazel-macos changes that affect all builds, like the tcl from the bazel repository, the same two tests fail on linux. So the regression differences must come from something macos specific and not from the parts that affect all builds. The failing tests on debian 13 are:

//src/utl/test:utl_cfile_unittest                                        PASSED in 0.1s
//src/psm/test:gcd_all_vss-tcl_test                                      FAILED in 0.4s
  /home/caeuser/.cache/bazel/_bazel_caeuser/b94be67085162c57f50af2aa14f5f63d/execroot/_main/bazel-out/aarch64-opt/testlogs/src/psm/test/gcd_all_vss-tcl_test/test.log
//src/psm/test:gcd_em_test_vdd-tcl_test                                  FAILED in 0.4s
  /home/caeuser/.cache/bazel/_bazel_caeuser/b94be67085162c57f50af2aa14f5f63d/execroot/_main/bazel-out/aarch64-opt/testlogs/src/psm/test/gcd_em_test_vdd-tcl_test/test.log

Executed 1157 out of 1157 tests: 1155 tests pass and 2 fail locally.
There were tests whose specified size is too big. Use the --test_verbose_timeout_warnings command line option to see which ones these are.
caeuser@debian:~/OpenROAD$  

gcd_all_vss: 
Differences found at line 3.
metal1,10.1650,15.4000,metal1,11.4350,15.4000,1.701e-17
metal1,10.1650,15.4000,metal1,11.4350,15.4000,1.700e-17
[INFO PSM-0015] Reading location of sources from: Vsrc_gcd_vss.loc.

gcd_em_test_vdd:
Differences found at line 2.
metal1,10.1650,11.2000,metal1,11.4350,11.2000,2.411e-13
metal1,10.1650,11.2000,metal1,11.4350,11.2000,1.119e-13
Exitcode:  0
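A quick calculation on the two differing values above shows that the two failures are not of the same kind. Which of the two lines in each diff is the golden value is an assumption here (the second line is treated as the reference); the size of the gap does not depend much on that choice.

```python
def relative_difference(reference: float, observed: float) -> float:
    """Relative deviation of an observed value from a reference value."""
    return abs(observed - reference) / abs(reference)


# Values copied from the two diffs above.
gcd_all_vss = relative_difference(1.700e-17, 1.701e-17)
gcd_em_test_vdd = relative_difference(1.119e-13, 2.411e-13)

print(f"gcd_all_vss:     {gcd_all_vss:.2e}")      # roughly 6e-4
print(f"gcd_em_test_vdd: {gcd_em_test_vdd:.2f}")  # roughly 1.15
```

The first is a change in the last printed digit, while the second is more than a factor of two, so the second is unlikely to be a pure rounding effect.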

stefanottili · 2026-02-18T10:46:51Z

Since macOS is using clang, could you try this?

> Disable fused multiply-add operations using the -ffp-contract=off compiler flag.

hzeller · 2026-02-18T12:57:38Z

@@ -55,7 +55,7 @@ class BazelInitializer
    }

    // Set the TCL_LIBRARY environment variable
-    const std::string tcl_path = runfiles->Rlocation("tk_tcl/library/");
+    const std::string tcl_path = runfiles->Rlocation("tcl_lang+/library/");


The + here looks like an artifact of how bazel 8 happens to name the directories.

I tested locally and the canonical tcl_lang/library/ also seems to work, so that is probably more robust to use.

hzeller · 2026-02-18T13:32:59Z

Thanks, yes the fix in the runfile section in bazel/InitRunFiles.cpp looks good; I have one suggestion to remove the + in the path, see above.

The PR also touches submodules (third-party/abc/ and src/sta/), but these changes should be done in the corresponding repositories (I am not sure if they can even be changed in the PR here).

Some changes (like the MacOS refinements) are somewhat independent of the tcl change.
So I suggest breaking this into multiple PRs, which are then easier to manage and merge.

The following is a suggestion (I am not owner of this project, so I can only recommend, @maliberty is the person to ask) how to split this into smaller changes:

Change to tcl in BCR

In this first change, add the tcl from BCR, but under the old name tk_tcl using the repo_name feature:

bazel_dep(name = "tcl_lang", version = "8.6.16.bcr.1", repo_name="tk_tcl")

Then remove tk_tcl and net_zlib from WORKSPACE and add the data dependency in bazel/BUILD so that the runfiles lookup works.
With these three files (and the now changed MODULE.bazel.lock) we can switch to the new tcl from BCR without having to rename all the references in all BUILD files yet.

Change the `OpenSTA` and `abc` repos

Do the changes to rename @tk_tcl to @tcl_lang in the submodule repositories with PRs there
https://github.com/The-OpenROAD-Project/OpenSTA
https://github.com/The-OpenROAD-Project/abc (technically, we don't need this anymore as we also pull this from BCR since recently, but I guess it would make sense for completeness reasons).

Create a PR updating the submodules and finally rename all tcl deps

Update the submodules in the main OpenROAD project (that now pull in the renamed @tk_tcl) and rename @tk_tcl to @tcl_lang in all BUILD files in the OpenROAD main repo.
Remove repo_name="tk_tcl" from the MODULE.bazel.
Also update bazel/InitRunFiles.cpp to refer to tcl_lang/.

Now we use Tcl from BCR under its BCR-given name.

Separate PR for MacOS fixes

You also have a bunch of changes for MacOS; these make sense in a separate change. That is easier to review, and we can focus on the regression test issues there.

fredowski · 2026-02-18T19:06:15Z

I made a new PR #9490 which only replaces the tcl library from rules_hdl with the BCR version. This needs changes in the OpenSTA submodule. The PR there is The-OpenROAD-Project/OpenSTA#290.

I will adapt the PR here with the MacOS specific changes.

fredowski · 2026-02-18T19:12:10Z

> since macOS is using clang, could you try this ?
>
> Disable fused multiply-add operations using the -ffp-contract=off compiler flag.

I tried that but it made no difference. I have the same 61 failing regression tests.

fredowski · 2026-02-20T10:00:57Z

@stefanottili - Here is some analysis of why MacOS and Linux builds produce different results, which lead to regression test failures. With the fixes, openroad produces identical results on MacOS and Linux. Linux was only tested on debian trixie aarch64. This is from claude code.

Platform Compatibility: macOS vs Linux Regression Test Determinism

Overview

When running OpenROAD regression tests on macOS (arm64) with a hermetic LLVM
toolchain, numerous tests fail that pass on Linux — even when both platforms use
the same CPU architecture (aarch64). The golden reference files (.ok,
.defok) are generated on Linux x86_64. This document describes the root
causes and the fixes applied to achieve cross-platform determinism.

Test Environment

| | Linux | macOS |
|---|---|---|
| Architecture | aarch64 (Debian 13) | arm64 (macOS 15) |
| Toolchain | Hermetic LLVM 20.1.8 | Hermetic LLVM 20.1.8 |
| C library | glibc | Apple libSystem |
| Build system | Bazel (bzlmod) | Bazel (bzlmod) |

Results Before and After Fixes

| | Before fixes | After fixes |
|---|---|---|
| macOS failures | 65 | 53 |
| Linux failures | 52 | 52 |
| macOS-only failures | 13 | 1 (unrelated socket test) |

The remaining 52 shared failures are due to the golden files being generated on
x86_64 Linux. These require golden file regeneration on aarch64 and are not
platform-compatibility issues.

Root Causes

1. System Math Library (libm) Differences

Impact: ~44 GPL tests, 2 PSM tests, 3 RSZ tests, 2 GRT tests, 1 PDN test

The hermetic LLVM toolchain provides the compiler but links against the
system C library: glibc on Linux, Apple's libSystem on macOS.
Transcendental math functions (sin, cos, exp, log, pow, sqrt,
atan) return results differing at the last bit of precision (ULP-level)
between the two implementations.

These tiny differences are amplified through:

- FFT in global placement (src/gpl/src/fftsg.cpp): sin()/cos() differ at ULP level, producing different density gradients, leading to different cell coordinates.
- STA delay calculation (src/sta/dcalc/DmpCeff.cc): custom exp2() approximation and Newton-Raphson root finding amplify input differences.
- Liberty table interpolation (src/sta/liberty/TableModel.cc): bilinear interpolation amplifies small input differences.
- Cascading algorithmic decisions: a tiny timing difference can flip optimization decisions, producing completely different results.
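The amplification described above can be shown with a toy example. The function below is hypothetical (it is not OpenROAD code, and truncation to 1000 database units per micron is just an assumed convention); it only demonstrates that a one-ULP difference in a floating-point input can become a whole-unit difference once the value is discretized.

```python
import math


def snap_to_grid(microns: float, dbu_per_micron: int = 1000) -> int:
    """Convert a coordinate to integer database units by truncation."""
    return math.floor(microns * dbu_per_micron)


exact = 1.0
one_ulp_lower = math.nextafter(exact, 0.0)  # requires Python 3.9+

print(exact - one_ulp_lower)        # about 1.1e-16
print(snap_to_grid(exact))          # 1000
print(snap_to_grid(one_ulp_lower))  # 999
```

Once two runs disagree on an integer coordinate or on a comparison such as a slack check, later optimization steps can diverge completely, which is consistent with the differing buffer counts and placements reported in the thread.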

Fix: Integrate OpenLibm (v0.8.7),
a portable math library from the Julia project, as a static library with
alwayslink = True. This forces all OpenLibm symbols to be loaded into the
binary, overriding system libm functions.

Files:

- MODULE.bazel: added http_archive for OpenLibm
- bazel/openlibm/bundled.BUILD.bazel: BUILD file with platform-specific source selection (x86_64 ld80, Linux aarch64 ld128, macOS aarch64 base-only)
- BUILD.bazel: added @openlibm dependency to the openroad binary
- .bazelrc: added --copt "-fno-builtin" globally to prevent the compiler from replacing math calls with platform-specific intrinsics

2. Apple's `__sincosf_stret` Optimization

Impact: same tests as root cause 1

On Apple targets, Clang merges adjacent sinf(x) + cosf(x) calls into a
single call to __sincosf_stret, an Apple-specific combined sin+cos function
in libSystem. This bypasses OpenLibm entirely.

The -fno-builtin flag does not prevent this optimization. Flags like
-fno-builtin-sincosf, -fno-builtin-sincos, and
-fno-builtin-__sincosf_stret were all tested and none worked.

Fix: Add -fmath-errno to macOS build configuration. This flag tells the
compiler that math functions may set errno, preventing the merger of sin()
and cos() into a single call (since each call might independently set
errno).

Files:

.bazelrc — added build:macos --copt=-fmath-errno --host_copt=-fmath-errno

3. `rand()` Producing Different Sequences

Impact: 1 GPL clustering test (clust02), contributed to other GPL
placement differences

The C standard library rand() function produces completely different
sequences on glibc vs Apple's libc for the same seed:

| Seed | glibc first values | Apple libc first values |
|---|---|---|
| 42 | 71876166, 708592740, ... | 705894, 1126542223, ... |
| 1 | 1804289383, 846930886, ... | 16807, 282475249, ... |

glibc uses a TYPE_3 trinomial feedback shift register with 31 words of
state, while Apple uses a simpler algorithm. This affects:

- ABC logic synthesis (64+ source files use rand()): different random simulation patterns in fraig, different optimization paths
- MBFF clustering (src/gpl/src/mbff.cpp:1326,1347): std::rand() used for K-Means++ initialization and cluster assignment
- Nesterov placement (src/gpl/src/nesterovBase.cpp:1914): random offsets during cell placement

Fix: Two-part fix:

- Portable rand()/srand(): a drop-in replacement implementing glibc's exact TYPE_3 algorithm (31-word state, separation 3, 310 warmup iterations). Compiled with alwayslink = True so the symbols override system rand in the final binary.
- Replace rand() with std::mt19937 in nesterovBase.cpp: the Mersenne Twister is specified by the C++ standard and produces identical sequences on all platforms.
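For readers who want to see the two generators side by side, here is a compact Python sketch. It is not the portable_rand.c from this PR. The first function follows the commonly documented description of glibc's TYPE_3 generator (31-word state, separation 3, 310 discarded warm-up values). The second is a plain Park-Miller step; it reproduces the Apple column of the table above, which suggests, but does not prove, that Apple's rand() works this way.

```python
def glibc_rand(seed: int, count: int) -> list[int]:
    """First `count` outputs of the additive-feedback generator described above."""
    r = [seed if seed != 0 else 1]
    for i in range(1, 31):
        # Seed the 31-word state with a Park-Miller sequence.
        r.append((16807 * r[i - 1]) % 2147483647)
    for i in range(31, 34):
        r.append(r[i - 31])
    for i in range(34, 344):
        # 310 warm-up steps whose outputs are discarded.
        r.append((r[i - 31] + r[i - 3]) & 0xFFFFFFFF)
    out = []
    for i in range(344, 344 + count):
        r.append((r[i - 31] + r[i - 3]) & 0xFFFFFFFF)
        out.append(r[i] >> 1)  # drop the least significant bit
    return out


def park_miller_rand(seed: int, count: int) -> list[int]:
    """First `count` outputs of x -> 16807 * x mod (2**31 - 1)."""
    x = seed
    out = []
    for _ in range(count):
        x = (16807 * x) % 2147483647
        out.append(x)
    return out


print(glibc_rand(1, 2))         # [1804289383, 846930886]
print(park_miller_rand(42, 2))  # [705894, 1126542223]
```

Since the two algorithms share nothing beyond the seeding constant, there is no seed for which the sequences line up, so any code path that depends on rand() diverges from the first call.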

Files:

- bazel/openlibm/portable_rand.c: glibc TYPE_3 rand/srand/rand_r
- bazel/openlibm/BUILD: cc_library with alwayslink = True
- BUILD.bazel: added //bazel/openlibm:portable_rand dependency
- src/gpl/src/nesterovBase.cpp: replaced srand(42); rand() with std::mt19937 offsetRng(42)

4. `qsort()` Ordering Differences for Equal Elements

Impact: 8 RMP tests, 1 PDN test, 1 GPL clustering test, 1 MPL test

The C standard does not guarantee qsort() stability — when elements compare
equal, their relative order is implementation-defined. glibc uses a
merge sort (which is stable), while Apple's libc uses a different algorithm
that produces different orderings:

Input:  {5,0} {3,1} {3,2} {1,3} {4,4} {2,5} {3,6} {5,7} {1,8}
glibc:  (1,3) (1,8) (2,5) (3,1) (3,2) (3,6) (4,4) (5,0) (5,7)
Apple:  (1,3) (1,8) (2,5) (3,6) (3,1) (3,2) (4,4) (5,7) (5,0)
                          ^^^^^^^^^^^^^^^^^       ^^^^^^^^^^^

ABC has 90+ qsort() calls. Different orderings of equal elements lead to
different optimization paths in fraig, rewriting, and technology mapping,
producing different gate counts and circuit structures.
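The example above can be checked mechanically. The sketch below uses Python's built-in sort, which is guaranteed stable, as a stand-in for glibc's merge sort. It confirms that both listed outputs are valid sorts by the first field, and that a comparator with a tie-breaker removes the ambiguity. The tie-breaker is only an illustration of the general idea; the PR itself takes a different route and ships a portable qsort().

```python
from functools import cmp_to_key

pairs = [(5, 0), (3, 1), (3, 2), (1, 3), (4, 4), (2, 5), (3, 6), (5, 7), (1, 8)]

glibc_order = [(1, 3), (1, 8), (2, 5), (3, 1), (3, 2), (3, 6), (4, 4), (5, 0), (5, 7)]
apple_order = [(1, 3), (1, 8), (2, 5), (3, 6), (3, 1), (3, 2), (4, 4), (5, 7), (5, 0)]


def is_sorted_by_key(items) -> bool:
    """True if the first fields are in non-decreasing order."""
    return all(a[0] <= b[0] for a, b in zip(items, items[1:]))


def compare_with_tiebreak(a, b) -> int:
    """Total order: first field, then original index, so no two elements tie."""
    return (a[0] > b[0]) - (a[0] < b[0]) or (a[1] > b[1]) - (a[1] < b[1])


stable = sorted(pairs, key=lambda p: p[0])
total = sorted(apple_order, key=cmp_to_key(compare_with_tiebreak))

print(stable == glibc_order)          # True
print(is_sorted_by_key(apple_order))  # True
print(total == glibc_order)           # True
```

Both orders satisfy the C standard, so neither libc is wrong; the code that consumes the sorted array is what relies on an unspecified property.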

Fix: Provide a portable qsort() implementation matching glibc's merge
sort algorithm on macOS. This is compiled only on macOS (Linux already uses
glibc's qsort natively).

Files:

- bazel/openlibm/portable_qsort.c: merge sort with insertion sort fallback
- bazel/openlibm/BUILD: included via select({"@platforms//os:macos": ...})

5. `-ffp-contract=off` (Pre-existing)

Impact: FFT-based placement tests

FP contraction allows the compiler to fuse multiply-add operations into FMA
instructions, which can produce slightly different results. This flag was
already set in .bazelrc before this work.

Files:

.bazelrc:29 — build --cxxopt "-ffp-contract=off"

Summary of All Changes

File	Change
`MODULE.bazel`	Added OpenLibm `http_archive` (v0.8.7)
`BUILD.bazel`	Added `@openlibm` and `//bazel/openlibm:portable_rand` deps
`bazel/openlibm/bundled.BUILD.bazel`	Full BUILD for OpenLibm with platform selects
`bazel/openlibm/BUILD`	Package with `portable_rand` and `portable_qsort` targets
`bazel/openlibm/portable_rand.c`	glibc TYPE_3 rand/srand implementation
`bazel/openlibm/portable_qsort.c`	glibc merge-sort qsort (macOS only)
`.bazelrc`	`-fno-builtin` (global), `-fmath-errno` (macOS)
`src/gpl/src/nesterovBase.cpp`	`rand()` replaced with `std::mt19937(42)`

Remaining Failures

Shared Failures (52 tests, both macOS and Linux)

These fail on aarch64 because golden files were generated on x86_64 Linux. The
nesterovBase.cpp change from rand() to mt19937 also changes placement
results vs the x86_64 golden files. These require golden file regeneration.

GPL — Global Placement (45 tests)

Test	Type
`//src/gpl/test:ar01`	DEF comparison
`//src/gpl/test:ar01-tcl_test`	Log comparison
`//src/gpl/test:ar02`	DEF comparison
`//src/gpl/test:ar02-tcl_test`	Log comparison
`//src/gpl/test:cluster_place01-tcl_test`	Log comparison
`//src/gpl/test:convergence01`	DEF comparison
`//src/gpl/test:convergence01-tcl_test`	Log comparison
`//src/gpl/test:core01`	DEF comparison
`//src/gpl/test:core01-tcl_test`	Log comparison
`//src/gpl/test:diverge01-tcl_test`	Log comparison
`//src/gpl/test:error01-tcl_test`	Log comparison
`//src/gpl/test:incremental01`	DEF comparison
`//src/gpl/test:incremental01-tcl_test`	Log comparison
`//src/gpl/test:nograd01`	DEF comparison
`//src/gpl/test:nograd01-tcl_test`	Log comparison
`//src/gpl/test:simple01`	DEF comparison
`//src/gpl/test:simple01-obs`	DEF comparison
`//src/gpl/test:simple01-obs-tcl_test`	Log comparison
`//src/gpl/test:simple01-rd-tcl_test`	Log comparison
`//src/gpl/test:simple01-ref`	DEF comparison
`//src/gpl/test:simple01-ref-tcl_test`	Log comparison
`//src/gpl/test:simple01-skip-io`	DEF comparison
`//src/gpl/test:simple01-skip-io-tcl_test`	Log comparison
`//src/gpl/test:simple01-tcl_test`	Log comparison
`//src/gpl/test:simple01-td`	DEF comparison
`//src/gpl/test:simple01-td-tcl_test`	Log comparison
`//src/gpl/test:simple01-td-tune`	DEF comparison
`//src/gpl/test:simple01-td-tune-tcl_test`	Log comparison
`//src/gpl/test:simple01-uniform`	DEF comparison
`//src/gpl/test:simple01-uniform-tcl_test`	Log comparison
`//src/gpl/test:simple02`	DEF comparison
`//src/gpl/test:simple02-rd-tcl_test`	Log comparison
`//src/gpl/test:simple02-tcl_test`	Log comparison
`//src/gpl/test:simple03`	DEF comparison
`//src/gpl/test:simple03-rd-tcl_test`	Log comparison
`//src/gpl/test:simple03-tcl_test`	Log comparison
`//src/gpl/test:simple04`	DEF comparison
`//src/gpl/test:simple04-rd-tcl_test`	Log comparison
`//src/gpl/test:simple04-tcl_test`	Log comparison
`//src/gpl/test:simple07`	DEF comparison
`//src/gpl/test:simple07-tcl_test`	Log comparison
`//src/gpl/test:simple08`	DEF comparison
`//src/gpl/test:simple08-tcl_test`	Log comparison
`//src/gpl/test:simple10`	DEF comparison
`//src/gpl/test:simple10-tcl_test`	Log comparison

GRT — Global Routing (2 tests)

Test	Type
`//src/grt/test:congestion2-tcl_test`	Log comparison
`//src/grt/test:congestion5-tcl_test`	Log comparison

PSM — Power (2 tests)

Test	Type
`//src/psm/test:gcd_all_vss-tcl_test`	Log comparison
`//src/psm/test:gcd_em_test_vdd-tcl_test`	Log comparison

RSZ — Resizer (3 tests)

Test	Type
`//src/rsz/test:buffer_ports10-tcl_test`	Log comparison
`//src/rsz/test:buffer_ports8-tcl_test`	Log comparison
`//src/rsz/test:buffer_ports9-tcl_test`	Log comparison

macOS-Only Failure (1 test)

Test	Root Cause
`//src/utl/test:utl_cfile_unittest`	`Utl.metrics_server_responds_with_basic_metric` fails with "shutdown: Socket is not connected". This is a macOS socket behavior difference (not numerical).

Tests Fixed by This Work (previously macOS-only, now passing)

Test	Root Cause	Fix
`//src/gpl/test:clust02-tcl_test`	`rand()`, `qsort()`	portable_rand, portable_qsort
`//src/mpl/test:fixed_macros2-tcl_test`	libm, `qsort()`	OpenLibm, portable_qsort
`//src/pdn/test:pads_black_parrot_flipchip_connect_overpads-tcl_test`	`qsort()`	portable_qsort
`//src/rmp/test:aes_annealing-tcl_test`	`qsort()`	portable_qsort
`//src/rmp/test:aes_asap7-tcl_test`	`qsort()`	portable_qsort
`//src/rmp/test:aes_dontuse_nangate45-tcl_test`	`qsort()`	portable_qsort
`//src/rmp/test:aes_nangate45-tcl_test`	`qsort()`	portable_qsort
`//src/rmp/test:const_cell_removal-tcl_test`	`qsort()`	portable_qsort
`//src/rmp/test:gcd_annealing2-tcl_test`	`qsort()`	portable_qsort
`//src/rmp/test:gcd_asap7-tcl_test`	`qsort()`	portable_qsort
`//src/rmp/test:gcd_restructure-tcl_test`	`qsort()`	portable_qsort

Why `-ffp-contract=off` Alone Was Not Sufficient

FP contraction (-ffp-contract) controls whether the compiler fuses
multiply-add operations into FMA instructions. While this is important for
reproducibility, it only affects compiler-level transformations. The
dominant sources of cross-platform divergence are:

- C library function implementations: sin(), cos(), exp() etc. return different results at ULP level between glibc and Apple libm. No compiler flag changes this.
- C library rand() algorithm: completely different PRNG algorithms between glibc and Apple libc.
- C library qsort() stability: different sort algorithms produce different orderings for equal elements.

These require library-level fixes (OpenLibm, portable_rand,
portable_qsort), not compiler flags.

Architecture

                    ┌─────────────────────┐
                    │   openroad binary    │
                    └─────┬───────────────┘
                          │ deps
            ┌─────────────┼──────────────────┐
            │             │                  │
   ┌────────▼───────┐  ┌─▼──────────┐  ┌───▼────────────┐
   │   OpenLibm     │  │ portable   │  │  portable      │
   │  (all platforms)│  │ rand       │  │  qsort         │
   │                │  │ (all plat) │  │  (macOS only)  │
   │ sin,cos,exp,   │  │            │  │                │
   │ log,pow,sqrt.. │  │ rand()     │  │ qsort()       │
   │                │  │ srand()    │  │ qsort_r()     │
   │ alwayslink=True│  │ rand_r()   │  │               │
   └────────────────┘  │            │  │ alwayslink=   │
                       │ alwayslink=│  │ True          │
                       │ True       │  └───────────────┘
                       └────────────┘

All three libraries use alwayslink = True which forces the linker to load
all symbols from the static archive into the final binary, overriding the
corresponding system library functions.

hzeller · 2026-02-20T14:21:25Z

Thanks for your extensive investigation!

The changes from fftsg.cpp could also partially be due to unintentional conversions between float and double in that file (which I addressed in #8317 but it stalled as I couldn't run all tests. Maybe time to revisit). Differences in the ULP might be possible.

calewis · 2026-02-20T15:12:52Z

Thanks for this work! Quick question, we've talked about this issue before, although none of us have investigated it to the level you've done here 🎉, I worry that to achieve true portability (at least on Linux/unix) we need to remove all dependence on cmath, or do you think what you've done here is sufficient? Even if OpenRoad removes cmath as a dependency, it seems like it would be hard to do the same for ABC.

github-actions

clang-tidy made some suggestions

Added -undefined dynamic_lookup linkopts on macOS for Python extension .so targets. The reference symbols are resolved at load time not link time. Without this the linker errors on undefined Python symbols. The change affects only the macos build. Signed-off-by: Friedrich Beckmann <friedrich.beckmann@tha.de>

Moved %include "Exception-py.i" before %template declarations so the Python exception handler is active when SWIG generates template wrapper code. Without this the python tests in the regression fail because they ask for tcl symbols. Signed-off-by: Friedrich Beckmann <friedrich.beckmann@tha.de>

I rewrote the test to use pure tcl file reads and string comparison. Now the test also works on macos. Signed-off-by: Friedrich Beckmann <friedrich.beckmann@tha.de>

Signed-off-by: Friedrich Beckmann <friedrich.beckmann@tha.de>

I extended the mechanism that is used to supply the defines for the python module. Now the linker options are set in one place. Signed-off-by: Friedrich Beckmann <friedrich.beckmann@tha.de>

maliberty · 2026-02-26T07:46:49Z

That makes sense so I just rebased it and it looks to build now.

github-actions · 2026-02-26T07:49:06Z

clang-tidy review says "All clean, LGTM! 👍"

github-actions · 2026-02-26T07:51:27Z

clang-tidy review says "All clean, LGTM! 👍"

github-actions · 2026-02-26T08:00:05Z

clang-tidy review says "All clean, LGTM! 👍"

I already made the tcl changes in the third-party/abc module. They are committed in abc. So let's track them. Signed-off-by: Friedrich Beckmann <friedrich.beckmann@tha.de>

github-actions · 2026-02-26T08:14:40Z

clang-tidy review says "All clean, LGTM! 👍"

fredowski · 2026-02-26T08:19:02Z

Sorry - I didn't see the automerge activity. I have rebased this and updated the third-party/abc submodule to track the abc tcl merge.

hzeller · 2026-02-26T10:42:41Z

currently, we're not using the third-party/abc but directly the one from BCR ... so you might not see any difference.

fredowski · 2026-02-26T12:33:13Z

> currently, we're not using the third-party/abc but directly the one from BCR ... so you might not see any difference.

Yes, I noticed, but I had made the changes before the main branch switched to the abc bazel repo. So this is just in case somebody wants to build against the local abc again.

stefanottili · 2026-02-27T16:30:36Z

@fredowski: Thanks for working on this. I gave it a try on a 16GB MacBook M1, MacOS Tahoe.

Compared to a cmake build:

- it felt like it took a long time to build, I need to quantify that
- used 13.4GB in /var/tmp and 16GB in ~/.cache compared to the 2GB in a build directory.

git pull https://github.com/The-OpenROAD-Project/OpenROAD.git --recurse-submodules
bazelisk run //:openroad
...
bazelisk test //src/...
...
Executed 1160 out of 1160 tests: 1098 tests pass and 62 fail locally.
There were tests whose specified size is too big. Use the --test_verbose_timeout_warnings command line option to see which ones these are.

I tried to build the gui, but that failed. Is this the right way?

bazelisk run --//:platform=gui //:openroad -- -gui

log.gz

hzeller · 2026-02-27T17:09:51Z

regarding the long build times

bazel builds all the dependencies as well to be fully hermetic, while the cmake version uses the binary libraries on the system.
Also, bazel distinguishes multiple builds internally (tool builds, shared libraries for tests, static libraries for binaries). The static and shared object builds can probably be folded together.

(I think the bazel gui build is not working yet on macos)

maliberty · 2026-02-27T22:39:01Z

I just merged the PR to enable qt6 on mac.

fredowski · 2026-02-28T00:09:06Z

> @fredowski: Thanks for working on this. I gave it a try on a 16GB MacBook M1, MacOS Tahoe.
>
> Compared to a cmake build:
>
> it felt like it took a long time to build, I need to quantify that

With my MacBook Pro M4 I need 22 minutes to compile after cleaning with

bazelisk clean && time bazelisk build --//:platform=gui //:openroad --disk_cache=

> used 13.4GB in /var/tmp and 16GB in ~/.cache compared to the 2GB in a build directory.

I also have around 13 GB in /private/var/tmp but only 3.4 GB in ~/.cache/bazel-disk-cache

> ...
> Executed 1160 out of 1160 tests: 1098 tests pass and 62 fail locally.
> There were tests whose specified size is too big. Use the --test_verbose_timeout_warnings command line option to see which ones these are.

Yes, these are the 62 tests which are failing initially.

> I tried to build the gui, but that failed. Is this the right way ?
> bazelisk run --//:platform=gui //:openroad -- -gui

This is correct. It should work now that #9568 is merged.
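
The disk-usage figures exchanged in this thread can be gathered with a small script. This is a sketch under assumptions: the two default paths are simply the ones named in the comments and may differ on other machines, and `directory_size_bytes` is a hypothetical helper that sums apparent file sizes, so its totals can differ somewhat from what `du` reports.

```python
import os
import sys
from pathlib import Path


def directory_size_bytes(root):
    """Sum the sizes of all regular files below root, skipping symlinks."""
    total = 0
    # Unreadable directories are silently skipped by os.walk's default.
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if not os.path.islink(path):
                    total += os.lstat(path).st_size
            except OSError:
                # File vanished or is unreadable; ignore it.
                pass
    return total


def report(paths):
    """Print one line per path with its size in GB, or a note if missing."""
    for raw in paths:
        path = Path(raw).expanduser()
        if path.is_dir():
            print(f"{directory_size_bytes(path) / 1e9:8.2f} GB  {path}")
        else:
            print(f" missing     {path}")


if __name__ == "__main__":
    # Default paths are the locations mentioned in this thread (assumption).
    defaults = ["~/.cache/bazel-disk-cache", "/private/var/tmp"]
    report(sys.argv[1:] or defaults)
```

Passing other directories on the command line, for example a cmake build directory, gives a like-for-like comparison.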

QuantamHD · 2026-02-28T01:30:37Z

@stefanottili With a clean build you're paying for compiling all the dependencies of OpenROAD, including Qt, from source and statically linking them into the OpenROAD binary. On each successive build you will only be paying for changes to the OpenROAD source code. Bazel is all about building incrementally and hermetically.

If you change a single cc file in the OR repo the recompile should be quick.

The other thing that's nice about compiling everything from source is that while it might take a while the first time it'll also pass the first time. So we're trading a bit of compile time upfront against the time it takes you to wrestle with apt-get/brew annoyance.
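
Why the second build is cheap can be shown with a toy model of content-keyed caching: an output is looked up by a hash of its inputs, and the compile step runs only when that key has not been seen before. This is a deliberately simplified sketch and not how Bazel is implemented; `ActionCache`, `fake_compile` and `build_all` are hypothetical names.

```python
import hashlib


class ActionCache:
    """Toy cache that maps a hash of (name, source text) to a build output."""

    def __init__(self):
        self._store = {}
        self.executed = 0  # number of times the compile step really ran

    def build(self, name, source_text, compile_fn):
        key = hashlib.sha256(f"{name}\0{source_text}".encode()).hexdigest()
        if key not in self._store:
            self._store[key] = compile_fn(source_text)
            self.executed += 1
        return self._store[key]


def fake_compile(source_text):
    """Stand-in for a compiler invocation."""
    return f"object({len(source_text)} bytes)"


def build_all(cache, sources):
    """Build every source file, reusing cached outputs where possible."""
    return {name: cache.build(name, text, fake_compile)
            for name, text in sources.items()}


if __name__ == "__main__":
    cache = ActionCache()
    sources = {"a.cc": "int a;", "b.cc": "int b;", "c.cc": "int c;"}
    build_all(cache, sources)
    print(cache.executed)  # 3: clean build compiles everything
    build_all(cache, sources)
    print(cache.executed)  # 3: nothing changed, nothing recompiled
    sources["b.cc"] = "int b = 1;"
    build_all(cache, sources)
    print(cache.executed)  # 4: only the edited file is recompiled
```

In this model a shared public cache is the same store pre-filled by someone else, which is why it could offset the cost of the first build.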

stefanottili · 2026-02-28T04:26:24Z

@QuantamHD I take your points, but let me play devil's advocate for a sec here:

a) In 20 years of compiling klayout with qt4/5/6 on rhel linux and 4 years of doing that and building OR on macOS:
not once did the thought "geez, I wish I would have to compile qt from source" cross my mind.
apt-get/brew installed a precompiled version and I never had to waste a single CPU cycle or any disk space to build it.

Any package manager/build system has its quirks but over time things tend to get fixed.
I've been building OR on Mac with cmake/brew after all.
And some things just work and keep on working: e.g. klayout is still using qmake.

b) /opt/brew 9GB + 2GB cmake build dir vs 18GB .cache + 11GB /private/var/tmp/_bazel...
18GB might not be a large difference to you, but my disk is getting full, so I noticed.
And compared to the 2GB cmake build dir or the 1GB klayout build, it stands out.

@fredowski
bazelisk clean && time bazelisk build --//:platform=gui //:openroad --disk_cache=
took 48min on a M1 MacBook

Many thanks for getting OR to compile with qt6, it would be nice if cmake + brew qt@5 and qt@6 would work.
I'm holding my breath for the platform independent regression test fixes.

maliberty · 2026-02-28T04:41:58Z

qt is using a prebuilt library, no? (https://github.com/The-OpenROAD-Project/qt_bazel_prebuilts)

QuantamHD · 2026-02-28T05:05:09Z

That repo is named that, but it is in fact building from source now. There's no way to reliably ship compatible binaries on linux so I just built it from source. Which mirrors what we do at Google.

maliberty · 2026-02-28T05:10:23Z

Perhaps we will need to have a public mac bazel cache too then.

QuantamHD · 2026-02-28T05:18:24Z

> @QuantamHD I take your points, but let me play devil's advocate for a sec here:
>
> a) In 20 years of compiling klayout with qt4/5/6 on rhel linux and 4 years of doing that and building OR on macOS: not once did the thought "geez, I wish I would have to compile qt from source" cross my mind. apt-get/brew installed a precompiled version and I never had to waste a single CPU cycle or any disk space to build it.
>
> Any package manager/build system has its quirks but over time things tend to get fixed. I've been building OR on Mac with cmake/brew after all. And some things just work and keep on working: e.g. klayout is still using qmake.
>
> b) /opt/brew 9GB + 2GB cmake build dir vs 18GB .cache + 11GB /private/var/tmp/_bazel... 18GB might not be a large difference to you, but my disk is getting full, so I noticed. And compared to the 2GB cmake build dir or the 1GB klayout build, it stands out.
>
> @fredowski bazelisk clean && time bazelisk build --//:platform=gui //:openroad --disk_cache= took 48min on a M1 MacBook
>
> Many thanks for getting OR to compile with qt6, it would be nice if cmake + brew qt@5 and qt@6 would work. I'm holding my breath for the platform independent regression test fixes.

@stefanottili I get it, I do. But let me try to give you the background on why we're going this route.

Every week or so I see a number of people who file an issue saying that they were not able to compile OpenROAD on their machine for one reason or another (you personally have filed ~13 such issues over the years), and for every person who files an issue, I assume 100 or more just give up. User adoption is a funnel, and building OpenROAD is the widest part of that funnel, which means making it more reliable is the best way to increase the number of user conversions.

Bazel does take longer the first time, and it does take more disk, but what it aims to be more than anything is reliable. I want it to work the first time, every time, for every user on the planet, no matter what weird linux or macos system they're on, and unfortunately the best way to ensure that is to make everything compile from source using a hermetic compiler toolchain.

Some of the build time issues, as Matt suggests, can be offset by populating a public cache that bazel can pull from, as is already done on linux platforms.

gemini-code-assist Bot reviewed Feb 16, 2026

This was referenced Feb 16, 2026

bazel/macos: migrate from rules_hdl based tcl to bazel repository The-OpenROAD-Project/OpenSTA#286

Closed

bazel/macos: migrate zlib and readline from rules_hdl to bazel repo The-OpenROAD-Project/abc#9

Merged

maliberty requested a review from QuantamHD February 16, 2026 21:16

fredowski force-pushed the bazel-macos branch 2 times, most recently from 65ed7ae to 5a7349e Compare February 16, 2026 22:31

hzeller reviewed Feb 18, 2026

fredowski force-pushed the bazel-macos branch from 5a7349e to f6157c0 Compare February 18, 2026 17:38

fredowski changed the title ~~bazel macos - migrate tcl from rules_hdl to bazel BCR, fix linker problems, swig and regression for macos~~ (NOT READY) bazel/macos - fix linker problems, swig and regression for macos Feb 18, 2026

fredowski force-pushed the bazel-macos branch from f6157c0 to 00943ad Compare February 18, 2026 20:33

fredowski force-pushed the bazel-macos branch 2 times, most recently from 2cdcb25 to a443776 Compare February 20, 2026 14:16

github-actions Bot reviewed Feb 20, 2026

Comment thread bazel/InitRunFiles.cpp

fredowski added 5 commits February 26, 2026 08:04

macos: fixed odb regression test which used md5sum (Linux-only)

43850d6

bazel/macos: fix shell script to also work on macos

6b9c9c9

bazel/macos: central setting for linker settings for python extensions

6d77389

maliberty approved these changes Feb 26, 2026

maliberty enabled auto-merge February 26, 2026 07:49

auto-merge was automatically disabled February 26, 2026 07:57
Head branch was pushed to by a user without write access

fredowski force-pushed the bazel-macos branch from 632cb2f to 6d77389 Compare February 26, 2026 07:57

bazel/macos: update third-party/abc submodule

561b8d5

maliberty merged commit 14472b8 into The-OpenROAD-Project:master Feb 26, 2026
13 checks passed

fredowski mentioned this pull request Feb 28, 2026

Regression testing / Reproducability: Fix MacOS and Linux differences for rand() and qsort() #9574

Closed

hzeller commented Feb 18, 2026

1. Change to tcl in BCR
2. Change the OpenSTA and abc repos
3. Create a PR updating the submodules and finally rename all tcl deps
4. Separate PR for MacOS fixes

fredowski commented Feb 20, 2026

Platform Compatibility: macOS vs Linux Regression Test Determinism

Overview

Test Environment

Results Before and After Fixes

Root Causes

1. System Math Library (libm) Differences

2. Apple's __sincosf_stret Optimization

3. rand() Producing Different Sequences

4. qsort() Ordering Differences for Equal Elements

5. -ffp-contract=off (Pre-existing)

Summary of All Changes

Remaining Failures

Shared Failures (52 tests, both macOS and Linux)

GPL — Global Placement (45 tests)

GRT — Global Routing (2 tests)

PSM — Power (2 tests)

RSZ — Resizer (3 tests)

macOS-Only Failure (1 test)

Tests Fixed by This Work (previously macOS-only, now passing)

Why -ffp-contract=off Alone Was Not Sufficient

Architecture
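
Root causes 3 and 4 in the outline above come down to behaviour the C standard leaves open: it does not fix the algorithm behind `rand()`, and `qsort()` makes no promise about the relative order of elements that compare equal. A common remedy is to use a fixed, self-contained generator and to sort with a key that breaks every tie. The Python sketch below only illustrates those two ideas and is not the code from this PR or from #9574; `make_lcg` and `sort_with_tie_break` are hypothetical names, and the generator constants are those of the sample `rand()` printed in the C standard.

```python
def make_lcg(seed):
    """Return a generator function with a fixed, platform-independent sequence.

    Uses the linear congruential recurrence from the sample rand()
    implementation in the C standard; each call yields a value in 0..32767.
    """
    state = seed & 0xFFFFFFFF

    def next_value():
        nonlocal state
        state = (state * 1103515245 + 12345) & 0xFFFFFFFF
        return (state >> 16) & 0x7FFF

    return next_value


def sort_with_tie_break(cells):
    """Sort (name, cost) pairs by cost, using the name to break ties.

    With a total order no two distinct elements compare equal, so the
    result does not depend on the sort algorithm or on the input order.
    """
    return sorted(cells, key=lambda cell: (cell[1], cell[0]))


if __name__ == "__main__":
    gen = make_lcg(1)
    print([gen() for _ in range(3)])

    cells = [("u2", 10), ("u1", 10), ("u3", 5)]
    # Both input orders give the same output because ties are broken by name.
    print(sort_with_tie_break(cells))
    print(sort_with_tie_break(list(reversed(cells))))
```

The same tie-breaking idea applies to a C comparator passed to `qsort()`: if it never returns 0 for distinct elements, the unspecified ordering of equal elements no longer matters.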
