Dynamic build support with CI matrix #45

Merged
angerman merged 7 commits into stable-ghc-9.14 from ci-dynamic-matrix-and-rename
Feb 14, 2026

@angerman angerman commented Sep 4, 2025

Summary

This PR adds comprehensive dynamic build support to the stable-haskell GHC build, building on the foundation from PR #137.

Changes

CI/Build System:

  • Add DYNAMIC=0/1 build/test matrix for static and dynamic builds
  • Add basename symlinks for nested shared libraries (.dylib, .so) to make them discoverable by the linker
  • Create ghc-iserv-dyn symlink so GHC can find the dynamic interpreter
  • Use concrete file target for testsuite-timeout to avoid unnecessary rebuilds
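The basename-symlink step in the list above can be pictured with a small sketch. This is not the Makefile logic from the PR; it is a minimal Python illustration, assuming a hypothetical library root whose subdirectories contain `.so` or `.dylib` files. The function name `link_nested_shared_libs` is invented for this example.

```python
import os
import tempfile
from pathlib import Path

SHARED_LIB_SUFFIXES = (".so", ".dylib")


def link_nested_shared_libs(lib_root: Path) -> list[Path]:
    """Create basename symlinks in lib_root for shared libraries found in subdirectories."""
    created = []
    # Materialise the listing first so links created below are not revisited.
    for path in sorted(lib_root.rglob("*")):
        if path.parent == lib_root or not path.is_file():
            continue  # only nested regular files are of interest
        if path.suffix not in SHARED_LIB_SUFFIXES:
            continue  # note: versioned names such as libfoo.so.1 are skipped here
        link = lib_root / path.name
        if link.exists() or link.is_symlink():
            continue  # keep whatever already owns that basename
        # A relative target keeps the tree relocatable.
        link.symlink_to(path.relative_to(lib_root))
        created.append(link)
    return created


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "pkg-1.0").mkdir()
        (root / "pkg-1.0" / "libHSpkg.so").write_bytes(b"")
        for link in link_nested_shared_libs(root):
            print(link.name, "->", os.readlink(link))
```

With the links in place, a linker or loader that searches only the top-level directory can find a library that physically lives one level deeper.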

RTS/Linker:

  • Improve RTS sublibrary support with -no-ghc-internal flag
  • Fix dynamic linking for RTS sublibraries (rpath injection, RTLD_GLOBAL for ghc-internal)
  • Dynamically compute RTS rpath dependencies instead of hardcoded list
  • Export RTS symbols from ghc-iserv for dynamic builds
  • Fix libffi-clib integration for dynamic builds

Testsuite:

  • Improve testlib.py robustness with better error handling
  • Adjust tests for RTS split

Based on

Test plan

  • CI passes for both DYNAMIC=0 and DYNAMIC=1 builds
  • Testsuite passes on all platforms

@angerman angerman requested a review from hasufell September 4, 2025 00:38
@angerman angerman self-assigned this Sep 4, 2025
@angerman angerman force-pushed the ci-dynamic-matrix-and-rename branch from 4de99d0 to 23dd6fa Compare September 4, 2025 00:50

angerman commented Sep 4, 2025

Interestingly, this now seems to fail due to the missing RTS sublib issues, which are fixed in #44. Specifically this:

#44 (comment)

angerman commented Sep 4, 2025

Fixes #51 and #52

@angerman angerman force-pushed the ci-dynamic-matrix-and-rename branch 3 times, most recently from dabc558 to bbb815a Compare September 4, 2025 12:46
@angerman angerman force-pushed the stable-ghc-9.14 branch 4 times, most recently from d9f06c2 to a1edbb5 Compare September 5, 2025 07:32
@angerman angerman force-pushed the ci-dynamic-matrix-and-rename branch 2 times, most recently from 7c8212f to e74ec54 Compare September 11, 2025 06:05
@angerman angerman force-pushed the ci-dynamic-matrix-and-rename branch from b78b210 to 1967f08 Compare September 17, 2025 03:20
@angerman angerman changed the base branch from stable-ghc-9.14 to stable-ghc-9.14-rebased September 17, 2025 08:28
Comment on lines +263 to +274
sanitized_hc := $(subst $(space),_,$(subst :,_,$(subst /,_,$(subst \,_,$(TEST_HC)))))
test_hc_hash := $(shell \
if command -v openssl >/dev/null 2>&1; then \
openssl dgst -sha256 $(TEST_HC) | awk '{print substr($$2, 1, 8)}'; \
elif command -v sha256sum >/dev/null 2>&1; then \
sha256sum $(TEST_HC) | awk '{print substr($$1, 1, 8)}'; \
elif command -v shasum >/dev/null 2>&1; then \
shasum -a 256 $(TEST_HC) | awk '{print substr($$1, 1, 8)}'; \
else \
echo "no_hash"; \
fi)
ghc_config_mk = $(TOP)/mk/$(test_hc_hash)_ghcconfig$(sanitized_hc).mk

@angerman angerman Sep 17, 2025

I should add a comment here explaining why we do this. It adds a hash of the GHC binary to the ghcconfig file name. This ensures we always read the correct ghcconfig: if the ghc changes, the hash changes, and we recompute.
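The naming scheme described in this comment can be sketched in Python. This is an illustration of the idea only, not the Makefile code: the helper names `sanitize`, `short_hash` and `ghc_config_mk` are invented here, and the sketch hashes with `hashlib` instead of probing for `openssl`, `sha256sum` or `shasum`, so it has no `no_hash` fallback.

```python
import hashlib
from pathlib import Path


def sanitize(path: str) -> str:
    # Mirror the $(subst ...) chain: backslash, slash, colon and space become '_'.
    for ch in ("\\", "/", ":", " "):
        path = path.replace(ch, "_")
    return path


def short_hash(compiler: Path) -> str:
    # First 8 hex digits of the SHA-256 of the compiler binary's contents.
    return hashlib.sha256(compiler.read_bytes()).hexdigest()[:8]


def ghc_config_mk(top: str, test_hc: str) -> str:
    # The hash is part of the file name, so a changed compiler at the same
    # path maps to a different cache file and the config is regenerated.
    return f"{top}/mk/{short_hash(Path(test_hc))}_ghcconfig{sanitize(test_hc)}.mk"
```

Two different compiler binaries installed at the same path therefore never share a cached config file, which is the property the comment is after.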

Comment thread testsuite/driver/testlib.py Outdated
Comment thread rts/rts.cabal
(NoArg (setGeneralFlag Opt_NoHsMain))
, make_ord_flag defGhcFlag "no-rts"
(NoArg (setGeneralFlag Opt_NoRts))
, make_ord_flag defGhcFlag "no-ghc-internal"

Why is this needed now?

Author

We have the following dependency as visible to cabal:

ghc-internal
+ rts

however, ghc will try to insert:

ghc-internal
+ rts-sublib (based on the -threaded / -debug flag)
  + rts

If we try to build the rts-sublib with ghc, we can end up trying to load ghc-internal, due to auto-injection of libraries.

Maybe a better solution is to add a flag to outright disable ghc's auto population of libs, instead of having separate ones for each lib 😅
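The interaction described in this reply can be illustrated with a toy model. This is not GHC's actual unit-injection logic; it is a sketch assuming that the compiler implicitly adds `ghc-internal`, a flag-selected RTS sublibrary, and `rts`, and that `-no-rts` and `-no-ghc-internal` each switch off one of those additions. The function name and the sublibrary naming scheme are hypothetical.

```python
def implicit_link_units(ways: set[str], no_rts: bool = False,
                        no_ghc_internal: bool = False) -> list[str]:
    """Toy model of the units a compiler might add implicitly at link time."""
    units = []
    if not no_ghc_internal:
        units.append("ghc-internal")
    if not no_rts:
        # Hypothetical naming: the sublibrary is picked from -threaded / -debug.
        threading = "threaded" if "threaded" in ways else "nonthreaded"
        debugging = "debug" if "debug" in ways else "nodebug"
        units += [f"rts:{threading}-{debugging}", "rts"]
    return units


if __name__ == "__main__":
    # An ordinary program gets everything injected.
    print(implicit_link_units({"threaded"}))
    # Building the RTS sublibrary itself must not pull in ghc-internal or rts.
    print(implicit_link_units({"threaded"}, no_rts=True, no_ghc_internal=True))
```

In this model a single "disable all implicit units" switch, as floated in the comment, would simply replace the two boolean parameters.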

@angerman angerman force-pushed the ci-dynamic-matrix-and-rename branch from 8c7a84c to 3e7c948 Compare November 27, 2025 06:18
@angerman angerman changed the base branch from stable-ghc-9.14-rebased to stable-ghc-9.14.2025.11.12 November 27, 2025 06:20
@angerman angerman force-pushed the ci-dynamic-matrix-and-rename branch from 2536c91 to d16343c Compare November 28, 2025 01:36
@angerman angerman force-pushed the stable-ghc-9.14.2025.11.12 branch 2 times, most recently from e808dde to 537b3a7 Compare November 28, 2025 08:40
@angerman angerman deleted the branch stable-ghc-9.14 November 29, 2025 02:16
@angerman angerman force-pushed the ci-dynamic-matrix-and-rename branch 4 times, most recently from a6aa186 to 1fe3235 Compare February 5, 2026 04:47

hasufell commented Feb 5, 2026

On ARM64 alpine musl linked:

2026-02-05T06:41:54.1525196Z +++ ghci/linking/dyn/T10458.run/T10458.run.stderr.normalised	2026-02-05 05:56:23.588129109 +0000
2026-02-05T06:41:54.1525572Z @@ -0,0 +1,11 @@
2026-02-05T06:41:54.1525710Z +
2026-02-05T06:41:54.1525937Z +GHC.Linker.Loader.dynLoadObjs: Loading temp shared object failed
2026-02-05T06:41:54.1526334Z +During interactive linking, GHCi couldn't find the following symbol:
2026-02-05T06:41:54.1526878Z +  Error loading shared library libAS.so: No such file or directory (needed by /tmp/ghc65945_tmp_9_0/libghc_tmp_10.so)
2026-02-05T06:41:54.1527408Z +This may be due to you not asking GHCi to load extra object files,
2026-02-05T06:41:54.1527802Z +archives or DLLs needed by your current session.  Restart GHCi, specifying
2026-02-05T06:41:54.1528228Z +the missing library using the -L/path/to/object/dir and -lmissinglibname
2026-02-05T06:41:54.1528641Z +flags, or simply by naming the relevant files on the GHCi command line.
2026-02-05T06:41:54.1529026Z +Alternatively, this link failure might indicate a bug in GHCi.
2026-02-05T06:41:54.1529380Z +If you suspect the latter, please report this as a GHC bug:
2026-02-05T06:41:54.1529698Z +  https://github.com/stable-haskell/ghc/issues

angerman commented Feb 5, 2026

I'll look at this. Why are we running dyn on static musl/alpine? That's odd, but probably something we should either prohibit by construction or fix.

@angerman angerman force-pushed the ci-dynamic-matrix-and-rename branch 4 times, most recently from 3ae8429 to 9cfb121 Compare February 7, 2026 03:56
Comment thread Makefile
Comment on lines +124 to +125
PATCHELF ?= patchelf
INSTALL_NAME_TOOL ?= install_name_tool
Author

I wonder if we really need these still?

Comment thread Makefile
Comment on lines +641 to +643
ifeq ($(DYNAMIC),1)
$(SED) -i -e 's/"RTS ways","/"RTS ways","dyn debug_dyn thr_dyn thr_debug_dyn /' $@
endif
Author

Yuck. This is so stupid.
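For readers unfamiliar with the `sed` expression in the quoted hunk, its effect on a settings line can be sketched in Python. The sample settings line used below is an assumption made for illustration; only the search text and the inserted ways come from the hunk. The function name is invented.

```python
RTS_WAYS_KEY = '"RTS ways","'
DYN_WAYS = "dyn debug_dyn thr_dyn thr_debug_dyn "


def add_dynamic_ways(settings_text: str) -> str:
    """Prepend the dynamic RTS ways to the "RTS ways" value on each line."""
    # sed's s/// without the g flag rewrites only the first match per line,
    # hence the count of 1. The pattern has no regex metacharacters, so a
    # literal replace behaves the same way.
    return "\n".join(
        line.replace(RTS_WAYS_KEY, RTS_WAYS_KEY + DYN_WAYS, 1)
        for line in settings_text.split("\n")
    )


if __name__ == "__main__":
    # Hypothetical settings entry, for illustration only.
    print(add_dynamic_ways(',("RTS ways","v thr debug thr_debug")'))
```

The rewrite is purely textual, which is why the comment calls it out: the ways list is patched after the fact instead of being generated correctly.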

Comment thread Makefile
# $1 = TIPLET
define build_cross
GHC=$(GHC) HADRIAN_SETTINGS='$(call HADRIAN_SETTINGS)' \
LD_LIBRARY_PATH=$(LD_LIBRARY_PATH) GHC=$(GHC) HADRIAN_SETTINGS='$(call HADRIAN_SETTINGS)' \
Author

The need for all this LD_LIBRARY_PATH stuff feels like a hack. It's something we ought to fix properly.

Comment thread Makefile
fi ; \
done ; \
if [ $${ffi_incdir} != "none" ] ; then $(call copy_headers,ffitarget.h,$(CURDIR)/$${ffi_incdir},libffi-clib,$(CURDIR)/_build/bindist/bin/$1-ghc-pkg$(EXE_EXT)) ; fi
if [ $${ffi_incdir} != "none" ] ; then $(call copy_headers,ffitarget.h,$(CURDIR)/$${ffi_incdir},libffi-clib,LD_LIBRARY_PATH=$(LD_LIBRARY_PATH) $(CURDIR)/_build/bindist/bin/$1-ghc-pkg$(EXE_EXT)) ; fi
Author

I believe this is fixed in cabal now, the copy-headers stuff?

Comment thread Makefile
Comment on lines +942 to +954
# $1 = rpath
# $2 = binary
# set rpath relative to the current executable
# TODO: on darwin, this doesn't overwrite rpath, but just adds to it,
# so we'll have the old rpaths from the build host in there as well
# set_rpath: Add rpath to binary. On Darwin, check if rpath already exists
# before adding (install_name_tool fails if rpath is duplicate).
define set_rpath
$(if $(filter Darwin,$(UNAME)), \
if ! otool -l "$(2)" 2>/dev/null | grep -A2 'LC_RPATH' | grep -q "@executable_path/$(1)"; then \
$(INSTALL_NAME_TOOL) -add_rpath "@executable_path/$(1)" "$(2)"; \
fi, \
$(PATCHELF) --force-rpath --set-rpath "\$$ORIGIN/$(1)" "$(2)")
Author

Again, this feels like a massive hack that we shouldn't need; GHC should just do this properly on its own.
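The branching inside the quoted `set_rpath` macro can be restated as a Python sketch. It only builds the command lines and mimics the `grep -A2` check on `otool -l` output; it does not run any tool. The function names are invented, and the sample `otool` output used in the demo is an assumption about that tool's usual layout.

```python
def set_rpath_command(rpath: str, binary: str, uname: str) -> list[str]:
    """Return the argv that would set an executable-relative rpath."""
    if uname == "Darwin":
        # install_name_tool appends an rpath; it does not replace existing ones.
        return ["install_name_tool", "-add_rpath", f"@executable_path/{rpath}", binary]
    # patchelf overwrites the rpath; $ORIGIN is expanded by the dynamic loader.
    return ["patchelf", "--force-rpath", "--set-rpath", f"$ORIGIN/{rpath}", binary]


def rpath_present(otool_output: str, rpath: str) -> bool:
    """Mimic: otool -l | grep -A2 LC_RPATH | grep -q @executable_path/<rpath>."""
    wanted = f"@executable_path/{rpath}"
    lines = otool_output.splitlines()
    for i, line in enumerate(lines):
        # grep -A2 yields the matching line plus the two lines after it.
        if "LC_RPATH" in line and any(wanted in l for l in lines[i:i + 3]):
            return True
    return False


if __name__ == "__main__":
    sample = "          cmd LC_RPATH\n      cmdsize 32\n         path @executable_path/../lib (offset 12)\n"
    if not rpath_present(sample, "../lib"):
        print(set_rpath_command("../lib", "bin/ghc", "Darwin"))
    print(set_rpath_command("../lib", "bin/ghc", "Linux"))
```

The Darwin branch needs the presence check because adding a duplicate rpath fails, as the quoted Makefile comment notes; the ELF branch needs none because it overwrites.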

Comment thread utils/ghc-iserv/cbits/iservmain.c Outdated
Comment on lines +7 to +16
/*
* Note [Boot library symbol visibility]
* ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
* Boot library promotion to RTLD_GLOBAL is now handled in RTS initialization.
* See Note [Promoting Boot Libraries to RTLD_GLOBAL] in rts/RtsStartup.c.
*
* This ensures all GHC API programs (not just ghc-iserv) properly export
* boot library symbols before any user code is loaded via dlopen.
*/

Author

I don't think this comment is necessary anymore 🤔

@angerman angerman force-pushed the ci-dynamic-matrix-and-rename branch 2 times, most recently from 4b6d197 to 5f99a8c Compare February 11, 2026 04:24
Comment thread libraries/ghci/GHCi/InfoTable.hsc Outdated
-- addresses directly from the running RTS. This is called at runtime after
-- promoteBootLibrariesToGlobal() has run, ensuring we get the correct RTS.
--
-- The lookup is cached in an IORef for efficiency.

You don't need an IORef: a CAF would be enough. On first entry it would perform the unsafePerformIO stuff.

-- | RTS function to get stg_interp_constr*_entry address.
-- This returns the address directly from the running RTS, avoiding any
-- symbol resolution issues from library loading order.
foreign import ccall unsafe "getInterpConstrEntryAddr"

In rts/includes/stg/MiscClosures.h there is this comment:

// RTS_FUN(stg_interp_constr1_entry);
// RTS_FUN(stg_interp_constr2_entry);
// RTS_FUN(stg_interp_constr3_entry);
// RTS_FUN(stg_interp_constr4_entry);
// RTS_FUN(stg_interp_constr5_entry);
// RTS_FUN(stg_interp_constr6_entry);
// RTS_FUN(stg_interp_constr7_entry);
//
// This is referenced using the FFI in the compiler (GHC.ByteCode.InfoTable),
// so we can't give it the correct type here because the prototypes
// would clash (FFI references are always declared with type StgWord[]
// in the generated C code). 

I think you should now be allowed to declare the correct types here and remove the comment.

Comment thread rts/linker/Elf.c Outdated
Comment on lines +57 to +79
* fit in a 32-bit slot. When a reference to a symbol in a shared library
* exceeds this range, we use "jump islands" (trampolines) for code references
* via makeSymbolExtra().
*
* We unfortunately can't tell whether symbol references are to code
* or data. So for now we assume they are code (the vast majority
* are), and allocate jump-table slots. Unfortunately this will
* SILENTLY generate crashing code for data references. This hack is
* enabled by X86_64_ELF_NONPIC_HACK.
* For FUNCTION symbols (STT_FUNC), jump islands work correctly: the call
* bounces through the trampoline to reach the real function.
*
* One workaround is to use shared Haskell libraries. This is the case
* when dynamically-linked GHCi is used.
* For DATA symbols (STT_OBJECT, STT_NOTYPE), jump islands are INCORRECT:
* the jump island address would be embedded as a data pointer (e.g., an info
* table pointer in a closure), causing the GC to interpret jump island memory
* as a valid info table — leading to "strange closure type" crashes.
*
* Another workaround is to keep the static libraries but compile them
* with -fPIC -fexternal-dynamic-refs, because that will generate PIC
* references to data which can be relocated. This is the case when
* +RTS -xp is passed.
* We detect data references using ELF_ST_TYPE(sym.st_info) from the symbol
* table. When a non-function symbol overflows 32-bit range, we emit a clear
* error message instead of silently creating a corrupt jump island.
*
* Workarounds for data reference overflow:
* - Use shared Haskell libraries (dynamically-linked GHCi)
* - Compile with -fPIC -fexternal-dynamic-refs (generates GOT-based
* relocations that can handle arbitrary distances)
* - Use +RTS -xp (maps loaded objects into low memory)
*
* This hack is enabled by X86_64_ELF_NONPIC_HACK.

Suggested change
* fit in a 32-bit slot. When a reference to a symbol in a shared library
* exceeds this range, we can use a hack (enabled by X86_64_ELF_NONPIC_HACK)
* which consists in generating "jump islands" (trampolines) for code references
* (this happens in makeSymbolExtra()).
*
* For FUNCTION symbols (STT_FUNC), jump islands work correctly: the call
* bounces through the trampoline to reach the real function.
*
* For DATA symbols (STT_OBJECT, STT_NOTYPE), jump islands are INCORRECT:
* the jump island address would be embedded as a data pointer (e.g., an info
* table pointer in a closure), causing the GC to interpret jump island memory
* as a valid info table, leading to "strange closure type" crashes.
*
* We detect data references using ELF_ST_TYPE(sym.st_info) from the symbol
* table. When a non-function symbol overflows 32-bit range, we emit a clear
* error message instead of silently creating a corrupt jump island.
*
* Workarounds for data reference overflow:
* - Use shared Haskell libraries (dynamically-linked GHCi)
* - Compile with -fPIC -fexternal-dynamic-refs (generates GOT-based
* relocations that can handle arbitrary distances)
* - Use +RTS -xp (maps loaded objects into low memory)
*

Author

This is good!

Comment thread rts/linker/Elf.c
{
StgInt64 off = value - P;
if (off != (Elf64_Sword)off && X86_64_ELF_NONPIC_HACK) {
/* Check symbol type: jump islands only work for code (functions).

Nice! We should probably upstream this.

Is it an issue for tables-next-to-code where a symbol is both data and code (entry code)? Shouldn't we use two symbols instead of one to trigger this error when appropriate?

Author

I just remember debugging this and scratching my head over wtf was going on. Hence this at least gives a somewhat clearer error message. Two symbols might be nice, although it would lead to even more symbol explosion. Windows also struggles massively with this data/code issue. Ultimately, I'm not sure TNTC is a feature worth keeping (it also inflates object sizes quite a bit by putting a lot of redundant data into the objects).
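The decision discussed in this thread can be modelled in a few lines of Python. This follows the Elf.c comment quoted earlier (function symbols may use a jump island, other symbol types produce an error on overflow); it is a sketch of that rule, not the RTS linker code, and the function names and return labels are invented. The `STT_*` values and the low-nibble type extraction are the standard ELF definitions.

```python
# Standard ELF symbol types.
STT_NOTYPE, STT_OBJECT, STT_FUNC = 0, 1, 2


def elf_st_type(st_info: int) -> int:
    # The symbol type lives in the low four bits of st_info.
    return st_info & 0xF


def fits_in_int32(off: int) -> bool:
    return -(1 << 31) <= off < (1 << 31)


def resolve_pc32(value: int, place: int, st_info: int) -> str:
    """Decide how a 32-bit PC-relative reference from place to value is handled."""
    off = value - place
    if fits_in_int32(off):
        return "direct"
    if elf_st_type(st_info) == STT_FUNC:
        # A call can bounce through a nearby trampoline.
        return "jump-island"
    # A data pointer must not point at trampoline memory, so fail loudly.
    raise OverflowError(f"data reference out of 32-bit range (offset {off:#x})")


if __name__ == "__main__":
    GLOBAL_FUNC, GLOBAL_OBJECT = 0x12, 0x11  # STB_GLOBAL << 4 | type
    print(resolve_pc32(0x2000, 0x1000, GLOBAL_FUNC))
    print(resolve_pc32(1 << 40, 0x1000, GLOBAL_FUNC))
```

The tables-next-to-code question raised above is exactly the case this model cannot express: one symbol that is used both as code and as data has only a single `st_info` type.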

Comment thread rts/rts.cabal
ghc-options: -this-unit-id rts -ghcversion-file=include/ghcversion.h -optc-DFS_NAMESPACE=rts
cmm-options: -this-unit-id rts

-- [The AutoApply story]

Sub-libraries are such a kludge. I'm not convinced they are worth all the added complexity.

Author

I'm open to other solutions that play nice with cabal. I'm not super happy about the sublibs, but they seem to be the least worst of the pack.

Comment on lines +666 to +678
# Wrap path resolution to avoid passing None/invalid paths to Path APIs.
def current(_way):
    p = path_func()
    if p is None:
        raise StatsException("No path returned for size collection")
    # If p looks absolute, use it directly; else resolve relative to testdir
    pth = Path(p)
    if not pth.is_absolute():
        pth = in_testdir(p)
    if not pth.exists():
        raise StatsException(f"Path not found for size collection: {pth}")
    return os.path.getsize(pth)
return collect_generic_stat ( 'size', deviation, current )

Do you remember why it failed? We need some motivation to upstream it. It looks reasonable but a concrete example would help.

return collect_size_func(deviation, lambda: find_non_inplace_so(library))
try:
return collect_size_func(deviation, lambda: find_non_inplace_so(library))
except Exception as _:

Ah, I guess the code above is to handle this case. Maybe not good for upstreaming then.

Author

Not sure. Making it more robust might be a good idea (and reduce the stable-haskell ghc/upstream drift).

Comment on lines +52 to +60
-- Export RTS symbols to dynamically loaded libraries
-- See Note [ghc-iserv and dynamic symbol export]
if os(linux) || os(freebsd)
ghc-options: -rdynamic
if os(osx) || os(darwin)
ghc-options: -optl -Wl,-flat_namespace
-- Note: Windows has a hard limit of 65535 symbol exports (16-bit index).
-- We cannot use --export-all-symbols here as we exceed that limit.

The Darwin/OSX -flat_namespace option should be passed in GHC.Runtime.Interpreter.C too.

(we already pass -Wl,--export-dynamic on ELF targets there)
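The per-platform choice made in the quoted cabal stanza can be summarised as a lookup. This is a restatement of the stanza in Python for illustration, not code from the PR; the function names are invented, and the flags and the 65535 figure are copied from the quoted hunk.

```python
MAX_PE_EXPORTS = 65535  # export limit quoted in the cabal comment (16-bit index)


def symbol_export_flags(os_name: str) -> list[str]:
    """Flags that make the executable's RTS symbols visible to dlopen'd libraries."""
    if os_name in ("linux", "freebsd"):
        return ["-rdynamic"]
    if os_name in ("osx", "darwin"):
        return ["-optl", "-Wl,-flat_namespace"]
    # Windows and anything else: nothing is added in the quoted stanza.
    return []


def can_export_all_symbols(symbol_count: int) -> bool:
    # Exporting everything is only an option below the PE export limit.
    return symbol_count <= MAX_PE_EXPORTS


if __name__ == "__main__":
    for name in ("linux", "darwin", "windows"):
        print(name, symbol_export_flags(name))
```

Keeping the mapping in one place is the point of the review comment: the interpreter-building code path would need to consult the same table as the cabal file.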

Comment thread Makefile
BINDIST_EXECTUABLES := \
ghc$(EXE_EXT) \
ghc-iserv$(EXE_EXT) \
ghc-iserv-dyn$(EXE_EXT) \

Do we still need to distribute iserv programs now that GHC can build them on demand?

Author

For now yes. In the future I hope not.

if os(osx) || os(darwin)
ghc-options: -optl -Wl,-flat_namespace
-- Note: Windows has a hard limit of 65535 symbol exports (16-bit index).
-- We cannot use --export-all-symbols here as we exceed that limit.

Does this mean that the dynamic build doesn't work on Windows?

Author

I don't think it ever has? And yes, this is a stupid limitation. But the issue is more about how many symbols we actually try to have visible.

This commit fixes several critical issues with the RTS object linker that
prevented dynamic GHC builds from loading code correctly.

Key fixes:

1. Detect data vs code references in X86_64_ELF_NONPIC_HACK (Elf.c)
   - The jump island mechanism was incorrectly applied to data references
   - For info table pointers (_con_info symbols), embedding a jump island
     address caused GC crashes ("strange closure type")
   - Now distinguishes R_X86_64_PLT32 (code) from R_X86_64_PC32 (data)
   - Data references use GOT-style indirection through extra->addr instead

2. Preserve dlerror for linker script fallback handling (LoadNativeObjPosix.c)
   - dlerror() clears after first call, losing error context
   - Now saves error string before retry logic
   - Fixes misleading error messages when loading fails

3. Promote boot libraries to RTLD_GLOBAL for dynamic code loading (RtsStartup.c)
   - Boot libraries loaded with RTLD_LOCAL weren't visible to dlsym
   - Dynamic object loading failed to resolve symbols from boot libs
   - Now re-opens boot libraries with RTLD_GLOBAL flag at startup

4. Skip loading libc/libm already linked into process (LoadNativeObjPosix.c)
   - Avoids redundant loading and symbol conflicts
   - Checks if library is already resident before calling dlopen

5. Dynamic lookup of stg_interp_constr entry points via dlsym (RtsSymbols.c)
   - Interpreter constructor symbols need runtime resolution
   - Adds dynamic fallback when static symbols unavailable

6. Remove residual debug instrumentation
   - Cleans up debugging code from Evac.c and LoadNativeObjPosix.c

This commit adds infrastructure for RTS sublibrary loading in dynamic builds,
enabling the split RTS architecture to work with shared library linking.

Key changes:

1. RTS sublibrary infrastructure (rts/rts.cabal)
   - Define separate sublibraries for RTS components
   - Add proper library dependencies and visibility
   - Configure shared library generation for RTS parts

2. Configure support for dynamic builds (rts/configure.ac)
   - Detect platform-specific dynamic linking requirements
   - Set appropriate linker flags for each sublibrary
   - Handle symbol visibility for exported functions

3. API updates for sublibrary boundaries (rts/include/RtsAPI.h)
   - Adjust exported symbol declarations
   - Ensure proper visibility across sublibrary boundaries

4. AutoApply support for interpreter (rts/AutoApply*.cmm)
   - Add AutoApply.cmm and vector variants (V16, V32, V64)
   - Required for dynamic bytecode interpreter operation

5. Cabal project configuration
   - cabal.project.stage1: Add no-ghc-internal flag for stage1 builds
   - cabal.project.stage2: Configure full RTS with all sublibraries

6. Thread infrastructure (rts/Threads.h)
   - Updates for sublibrary thread handling

This commit addresses compiler and driver issues specific to dynamic GHC
builds, ensuring proper code generation and linking behavior.

Key changes:

1. Make Opt_ExternalDynamicRefs default on all PIC platforms (DynFlags.hs)
   - Previously only enabled for specific configurations
   - Dynamic builds require external dynamic references for proper GOT usage
   - Prevents relocation issues with large code models

2. Pipeline and session handling updates
   - Driver/Pipeline.hs: Handle dynamic linking in compilation pipeline
   - Driver/Session.hs: Session configuration for dynamic builds
   - Driver/Flags.hs: Flag handling for dynamic mode

3. Linker updates for dynamic mode
   - Linker/Executable.hs: Executable linking for dynamic builds
   - Linker/Static.hs: Static linking coordination
   - ByteCode/Linker.hs: Bytecode linker for dynamic interpreter

4. Unit state and GHCi support
   - Unit/State.hs: Package database handling for dynamic libs
   - GHCi/InfoTable.hsc: Info table generation for dynamic mode
   - Tc/Gen/Splice.hs: Template Haskell splice handling

GHC and ghc-iserv load Haskell shared libraries dynamically for Template
Haskell and GHCi. These libraries reference RTS symbols (e.g.,
stg_INTLIKE_closure) that are linked into the executable. Without special
linker flags, those symbols aren't visible to dlopen'd libraries.

This commit adds platform-specific linker flags to export these symbols:

- Linux/FreeBSD: -rdynamic (passes --export-dynamic to ld)
- macOS: -flat_namespace (makes all symbols visible across namespaces)
- Windows: Cannot use --export-all-symbols due to 65535 symbol limit

See Note [ghc-iserv and dynamic symbol export] in ghc-iserv.cabal.in
for detailed explanation of the approach and alternatives considered.

This commit adds build system support for creating dynamic GHC builds,
including Makefile targets, bindist generation, and utility configurations.

Key changes:

1. Makefile enhancements
   - Add DYNAMIC=1 build variable support
   - Create dylib symlinks for macOS dynamic builds
   - Use concrete file target for testsuite-timeout
   - Include ghc-iserv-dyn in tarballs for all targets
   - Proper bindist generation for dynamic builds

2. Utility cabal files (hp2ps.cabal, unlit.cabal)
   - Configure for dynamic linking support
   - Ensure utilities work with dynamic GHC

3. ghc-iserv infrastructure (iservmain.c)
   - Updates for dynamic interpreter server
   - Proper initialization for dynamic linking context

4. Test expectations for Stable Haskell
   - Update bug report URL in test expectations

Usage:
  make DYNAMIC=1 _build/bindist  # Build dynamic GHC bindist

This commit updates the testsuite to handle the split RTS architecture
and dynamic GHC build configuration.

Key changes:

1. testlib.py improvements
   - More robust test driver for dynamic builds
   - Better handling of shared library paths
   - Improved error detection and reporting

2. Test infrastructure (boilerplate.mk)
   - Configure tests for dynamic linking environment
   - Set proper library paths for test execution

3. Test adjustments for RTS split
   - T18072debug: Update grep to match cabal-based RTS naming
   - T23142.hs: Revert module name to fix -Di debug output test
   - keep-cafs-fail.stdout: Update expected output

4. Dynamic linking test updates
   - ghci/linking/dyn/all.T: Adjust for dynamic GHC
   - T2228: Restore expect_broken(7298) for dynamic builds
   - T11531.stderr: Update expected error messages

5. Platform-specific adjustments
   - T10458: Skip on musl with dynamic GHC
   - T11223 tests: Update stderr expectations for Windows

6. Test configuration
   - .gitignore: Add patterns for dynamic test artifacts
   - dynlibs/Makefile: Update for dynamic build testing
   - perf/size/all.T: Adjust size expectations

This commit extends the CI/CD pipeline to build and test dynamic GHC
configurations alongside the existing static builds.

Key changes:

1. ci.yml - Main CI workflow
   - Add DYNAMIC=1 to build matrix
   - Configure dynamic build jobs for Linux and macOS
   - Run ghci-ext tests on dynamic builds (require interpreter)
   - Parallel execution of static and dynamic builds

2. reusable-release.yml - Release workflow
   - Add dynamic GHC builds to release artifacts
   - Generate separate bindists for dynamic configuration
   - Include ghc-iserv-dyn in release tarballs
   - Re-enable release workflow on pull requests for testing

The dynamic build matrix allows testing of:
- Template Haskell with dynamic code loading
- GHCi interactive features
- Dynamic library loading and linking
- Interpreter-based test suites (ghci-ext)

Build configurations:
- Static (default): DYNAMIC=0 or unset
- Dynamic: DYNAMIC=1
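The shape of the static/dynamic matrix described in this commit message can be sketched as a cross product. This does not reproduce ci.yml; the platform labels are placeholders, the function name is invented, and only the `DYNAMIC=0/1` values and the `_build/bindist` target come from the text above.

```python
from itertools import product


def build_matrix(platforms, dynamic_values=(0, 1)):
    """Cross every platform with every DYNAMIC setting."""
    return [
        {
            "platform": platform,
            "DYNAMIC": dynamic,
            # Arguments that would be passed to make for this job.
            "make_args": [f"DYNAMIC={dynamic}", "_build/bindist"],
        }
        for platform, dynamic in product(platforms, dynamic_values)
    ]


if __name__ == "__main__":
    # Placeholder platform labels, for illustration only.
    for job in build_matrix(["linux", "macos"]):
        print(job["platform"], "make", *job["make_args"])
```

Each platform thus gets one static and one dynamic job, which is what allows the two configurations to run in parallel.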
@angerman angerman force-pushed the ci-dynamic-matrix-and-rename branch from 5f99a8c to 0be09ea Compare February 13, 2026 07:55
@angerman angerman merged commit 967ae7b into stable-ghc-9.14 Feb 14, 2026
33 checks passed
@angerman angerman deleted the ci-dynamic-matrix-and-rename branch February 14, 2026 04:02