Build dynamic GHC in github CI #137

Closed
hasufell wants to merge 19 commits into stable-ghc-9.14 from stable-ghc-9.14-dynamic

@hasufell
Member

No description provided.

@hasufell force-pushed the stable-ghc-9.14-dynamic branch 2 times, most recently from 9886269 to 3004a5d, January 15, 2026 05:19
@hasufell force-pushed the stable-ghc-9.14-dynamic branch from 8235630 to ca591b6, January 15, 2026 10:10
@angerman mentioned this pull request Jan 19, 2026
2 tasks
@angerman

Of note: from the #45 work, the compiler conceptually worked. The issue was mostly getting the test suite green on Linux. On Darwin, linking worked well; on Linux it didn't, because we were missing the library paths. I tried really hard not to rely on the LD_LIBRARY_PATH/LD_PRELOAD solution. Ideally GHC would set these correctly. I remember writing about this somewhere (Slack, Discord?) at length, but that was months ago. I should have written it here :(

@hasufell force-pushed the stable-ghc-9.14-dynamic branch 7 times, most recently from 89052eb to cd43088, January 21, 2026 10:07
hasufell and others added 9 commits January 27, 2026 13:26
- Add -no-ghc-internal flag to prevent auto-injection of ghc-internal
  when building RTS sublibraries (mirrors existing -no-rts flag). This
  prevents circular dependency issues during RTS build.
- Move CMM sources into common rts-cmm-sources-base stanza shared by
  all sublibraries (threaded, debug, etc.)
- Implement AutoApply.cmm.h workaround: generate .cmm.h files in main
  library, include them via wrapper .cmm files in sublibraries. This
  ensures each sublibrary gets properly parameterized CMM code.
- Double all build flags (ghc-options, cpp-options, cmm-options,
  cc-options) in rts.cabal to ensure they're definitely propagated
- Remove auto-link from hp2ps and unlit (C-only utilities that don't
  need Haskell runtime linking)

Add libffi-clib package configuration to suppress C compiler warnings
that are promoted to errors by default. The bundled libffi code produces
warnings that would otherwise fail the build.

This is needed because:
- GHC uses bundled libffi for FFI support
- The libffi-clib wrapper exposes libffi to the RTS
- Upstream libffi code triggers compiler warnings
- With -Werror (often set by default), these become fatal errors

This commit fixes several dynamic linking issues that arise when using
GHC with RTS sublibraries and the GHC API:

- Inject rpath for RTS and libffi-clib in dynamic builds. This ensures
  the dynamic linker can find these libraries at runtime, especially
  when cabal passes -dynload deploy.

- Promote ghc-internal to RTLD_GLOBAL when loaded via dlopen. This
  prevents duplicate symbol errors when multiple shared libraries
  reference the same ghc-internal symbols.

- Export RTS symbols from ghc-iserv for dynamic builds. Programs using
  the GHC API load shared libraries via dlopen() that reference RTS
  symbols like stg_INTLIKE_closure.

- Apply -rdynamic unconditionally for GHC API programs (Linux/FreeBSD)
  and -flat_namespace for macOS. This makes RTS symbols visible to
  dynamically loaded libraries even when the main executable wasn't
  compiled with -dynamic.

See Note [Export dynamic symbols for GHC API programs] in Linker/Static.hs
and Note [ghc-iserv and dynamic symbol export] in ghc-iserv.cabal.in.

Improve error handling and path resolution in the test driver:

- Add proper error handling for missing directories and files when
  searching for shared objects, raising StatsException with clear
  error messages instead of silently failing
- Fix path resolution in collect_size_func to handle both absolute
  and relative paths correctly
- Improve ghc-pkg output parsing to handle various output formats
- Add fallback logic for finding shared objects: try inplace first,
  fall back to non-inplace if that fails
- Convert silent failures to explicit StatsException raises so test
  failures are properly reported
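
The fallback and error-reporting behaviour described above can be sketched roughly as follows. This is only an illustration, not the driver's code: `StatsException` is redefined locally as a stand-in so the snippet runs on its own, and `find_in_dir`, `find_shared_object`, and the `.so`/`.dylib` suffix check are hypothetical.

```python
from pathlib import Path


class StatsException(Exception):
    """Local stand-in for the test driver's exception type (assumption)."""


def find_in_dir(directory: Path, prefix: str) -> Path:
    """Return the first shared object in `directory` whose name starts with `prefix`."""
    if not directory.is_dir():
        # Explicit error instead of silently returning nothing.
        raise StatsException(f"library directory not found: {directory}")
    matches = sorted(
        p for p in directory.iterdir()
        if p.name.startswith(prefix) and p.suffix in (".so", ".dylib")
    )
    if not matches:
        raise StatsException(f"no shared object {prefix}* in {directory}")
    return matches[0]


def find_shared_object(inplace_dir: Path, installed_dir: Path, prefix: str) -> Path:
    """Try the inplace directory first, then fall back to the non-inplace one."""
    errors = []
    for directory in (inplace_dir, installed_dir):
        try:
            return find_in_dir(Path(directory), prefix)
        except StatsException as exc:
            errors.append(str(exc))
    # Both lookups failed: report every reason so the test failure is visible.
    raise StatsException("; ".join(errors))
```

Collecting both error messages before raising keeps the reason for the inplace miss visible even when the fallback also fails.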

Adjust the test suite for the RTS sublibrary split:

- Prefix ghcconfig filename with hash of TEST_HC binary to ensure we
  recompute the config when the compiler changes. This prevents stale
  config values when switching between different GHC versions.

- Disable rts test which is invalid since the RTS split (the test
  assumes monolithic RTS structure)

- Mark T2228 as not broken (it was incorrectly marked)

- Add testsuite-specific .gitignore entries
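
The idea behind the hash-prefixed ghcconfig file can be illustrated as below. The commit message does not say which hash is used or how the filename is laid out, so SHA-256, the 16-character truncation, the `<hash>-ghcconfig` name, and the function `ghcconfig_path` are all assumptions made for this sketch.

```python
import hashlib
from pathlib import Path


def ghcconfig_path(test_hc: str, config_dir: str = ".") -> Path:
    """Derive a config filename that changes whenever the compiler binary changes."""
    # Hash the contents of the compiler binary (hash choice is an assumption).
    digest = hashlib.sha256(Path(test_hc).read_bytes()).hexdigest()[:16]
    # Hypothetical layout: "<hash>-ghcconfig" inside config_dir.
    return Path(config_dir) / f"{digest}-ghcconfig"
```

Because the name depends on the binary's contents, switching `TEST_HC` to a different compiler yields a different file, so a cached config from another GHC is never picked up.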

Replace hardcoded ["rts", "libffi-clib"] list with a function that
dynamically computes which packages need rpath injection by checking:
1. Any package named "rts" (covers all RTS sublibraries)
2. Any direct dependency of an RTS package

This is more robust as it will automatically handle any future
library dependencies the RTS might gain.

Also adds Note [RTS sublibrary rpath injection] explaining why GHC
must always inject rpaths for RTS-related libraries regardless of
Cabal's -dynload deploy setting - Cabal fundamentally cannot see
the RTS sublibrary selection which happens at GHC link time.
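
The two selection rules can be modelled in a few lines. The real function lives in GHC's Haskell sources; the `Unit` record and `rpath_units` below are hypothetical names used only to show the rule, with package data supplied by hand.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Unit:
    unit_id: str
    package_name: str
    depends: List[str] = field(default_factory=list)  # unit ids of direct deps


def rpath_units(units: Dict[str, Unit]) -> Set[str]:
    """Units that need rpath injection under the two rules from the commit message."""
    # Rule 1: any unit whose package is named "rts" (all RTS sublibraries).
    rts_units = [u for u in units.values() if u.package_name == "rts"]
    needed = {u.unit_id for u in rts_units}
    # Rule 2: any direct dependency of an RTS unit (not transitive).
    for u in rts_units:
        needed.update(u.depends)
    return needed
```

With an `rts` sublibrary that depends on `libffi-clib`, both are selected, while packages that merely depend on the RTS are not.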

Static.hs imported GHC.Linker.ExtraObj which doesn't exist in the
stable-ghc-9.14 base branch. The required functions are already
available in GHC.Linker.Executable with ExecutableLinkOpts-based API.

- Export mkExtraObjToLinkIntoBinary and mkNoteObjsToLinkIntoBinary from Executable.hs
- Update Static.hs to import from Executable instead of ExtraObj
- Convert DynFlags to ExecutableLinkOpts using initExecutableLinkOpts
angerman and others added 8 commits January 27, 2026 13:58
The linker module APIs differ between stable-ghc-9.14 and the rebased
branch. Update to use the correct function signatures:

- maybeCreateManifest: use initManifestOpts dflags instead of dflags
- initLinkerConfig: takes only DynFlags (no require_cxx parameter)
- runLink: pass require_cxx as 4th argument
- runInjectRPaths: use configureOtool/configureInstallName instead of toolSettings
- runRanlib: use configureRanlib dflags instead of dflags

Import the required config functions from GHC.SysTools.Tasks.

On Darwin, install_name_tool -add_rpath fails if the rpath already exists.
When building with DYNAMIC=1, the GHC linker already injects rpaths during
linking (via runInjectRPaths), so binaries already have the @executable_path
rpath when the bindist target tries to add it again.

Fix by checking if the rpath already exists before attempting to add it.
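
One way such a check could look is sketched below. The commit does not say how the existing rpaths are detected; this sketch assumes the text output of `otool -l` is available and scans it for `LC_RPATH` load commands. The helper names `existing_rpaths` and `add_rpath_command` are hypothetical.

```python
from typing import List, Optional


def existing_rpaths(otool_output: str) -> List[str]:
    """Extract rpath entries from the text printed by `otool -l <binary>`."""
    rpaths = []
    in_rpath = False
    for line in otool_output.splitlines():
        stripped = line.strip()
        if stripped.startswith("cmd "):
            # A new load command begins; remember whether it is an rpath entry.
            in_rpath = stripped.split()[1] == "LC_RPATH"
        elif in_rpath and stripped.startswith("path "):
            # Line looks like: "path @executable_path/../lib (offset 12)"
            value = stripped[len("path "):].rsplit(" (offset", 1)[0]
            rpaths.append(value)
            in_rpath = False
    return rpaths


def add_rpath_command(binary: str, rpath: str, otool_output: str) -> Optional[List[str]]:
    """Return the install_name_tool command to run, or None if the rpath is present."""
    if rpath in existing_rpaths(otool_output):
        return None
    return ["install_name_tool", "-add_rpath", rpath, binary]
```

Returning `None` when the entry already exists lets the caller skip the `install_name_tool` invocation that would otherwise fail.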

On Unix systems, fundamental libraries like libc, libm, pthread, dl, and rt
are always linked into any process at startup. When GHCi tries to load these
via dlopen (e.g., because ghc-internal has "extra-libraries: c m"), it can
cause problems on NixOS where gcc may find a different version than the one
the interpreter is linked against.

Loading multiple copies of libc causes memory corruption and "strange closure
type" GC crashes. The fix adds isAlwaysLinkedLib check in load_dyn to skip
loading these fundamental system libraries, as they're always available.

This fixes ghci-ext test failures on DYNAMIC=1 builds with NixOS.
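
The filtering step amounts to the following, shown here as a Python model of a check that actually lives in GHC's Haskell loader. The library set is taken from the commit message; `is_always_linked_lib` and `libs_to_dlopen` are illustrative names, and the real check may cover more names or spellings.

```python
# Libraries named in the commit message as always present in a Unix process.
ALWAYS_LINKED = frozenset({"c", "m", "pthread", "dl", "rt"})


def is_always_linked_lib(name: str) -> bool:
    """True for libraries that should never be loaded a second time via dlopen."""
    return name in ALWAYS_LINKED


def libs_to_dlopen(extra_libraries):
    """Drop the always-linked libraries from a package's extra-libraries list."""
    return [lib for lib in extra_libraries if not is_always_linked_lib(lib)]
```

For a package declaring `extra-libraries: c m gmp`, only `gmp` would remain to be loaded.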

The dlerror() function can only be called once per error - subsequent
calls return NULL. Debug logging was consuming dlerror() before it
could be saved for the linker script fallback handler.

On Linux, system libraries like libc.so and libm.so are often GNU ld
linker scripts rather than actual ELF files. The RTS has code in
loadNativeObjFromLinkerScript_ELF() to handle this case by parsing
the linker script and extracting the real library path (e.g.,
libc.so -> libc.so.6).

However, this fallback requires the error message from dlerror() which
contains the filename. When debug logging called dlerror() first, the
error message was lost and the linker script handler received NULL,
causing it to fail silently.

Fix by saving dlerror() immediately after dlopen fails, before any
debug logging. This allows the linker script fallback to work correctly
on systems (like NixOS) where libc.so is a linker script.
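
Why the saved message matters can be seen in a small model of the fallback. This is not the RTS code: the two regular expressions are simplified approximations written for this sketch, and `real_library_from_dlerror` and its `read_file` parameter are hypothetical.

```python
import re
from pathlib import Path
from typing import Callable, Optional

# Simplified patterns (assumptions, not the RTS's own expressions).
INVALID_ELF = re.compile(r"([^\s:()]+\.so[^\s:()]*):\s*(invalid ELF header|file too short)")
REAL_SO = re.compile(r"(GROUP|INPUT)\s*\(\s*([^\s)]+)")


def real_library_from_dlerror(
    errmsg: Optional[str],
    read_file: Callable[[str], str] = lambda p: Path(p).read_text(),
) -> Optional[str]:
    """Resolve a linker script to the real library, using the saved dlerror() text."""
    if errmsg is None:
        # The message was already consumed, so the script's filename is unknown.
        return None
    failed = INVALID_ELF.search(errmsg)
    if failed is None:
        return None
    # The error text names the file that turned out not to be an ELF object.
    script = read_file(failed.group(1))
    real = REAL_SO.search(script)
    return real.group(2) if real else None
```

Passing `None` models the bug: once an earlier `dlerror()` call has cleared the message, there is no filename left to open and the fallback cannot proceed.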

Also disables non-Nix CI PR triggers in release.yml to save resources.

The RTLD_NOLOAD check was in a preprocessor guard that occurred before
including <dlfcn.h>, so it was never defined and the promoteBootLibrariesToGlobal
function was being compiled out entirely.

This fixes the ghci-ext test failures on Linux with DYNAMIC=1 builds.

On some Linux/glibc configurations, RTLD_NOLOAD requires _GNU_SOURCE
to be defined. This must be done before any headers are included.
This ensures promoteBootLibrariesToGlobal() is compiled on Linux.
@angerman

Superseded by #45

@angerman closed this Feb 14, 2026

2 participants