2 changes: 2 additions & 0 deletions CHANGELOG.md
@@ -1,6 +1,8 @@
# Changelog

## Unreleased
* Support baking `PythonCall` into a juliacall system image via
`PythonCall._is_embedded[] = true` in a PackageCompiler `script=`.
* Added option `lib` to JuliaCall. Setting this will skip the discovery subprocess.
* Bug fixes.

49 changes: 49 additions & 0 deletions docs/src/juliacall.md
@@ -115,6 +115,55 @@ systems that may be readonly. Note that the project set in
`PYTHON_JULIACALL_PROJECT` *must* already have PythonCall.jl installed and it
*must* match the JuliaCall version, otherwise loading Julia will fail.

### [Baking PythonCall into a system image](@id baking-sysimage)

The first `import juliacall` in a fresh process is slow (typically 10-20
seconds in a clean container) because Julia starts, deserialises
`PythonCall` from cache, and JIT-compiles the bridge's hot paths.
Long-running processes amortise that cost. Short-lived ones (serverless
functions, queue workers, and CI jobs that start, handle one request, and
exit) pay it on every invocation.
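
To check how much of this cost applies in a given environment, the import can be timed in a child process. The following is a minimal sketch, not part of this change: the helper name `time_fresh_import` is hypothetical, and the figure it returns also includes the Python interpreter's own startup time.

```python
import subprocess
import sys
import time


def time_fresh_import(module: str) -> float:
    """Return wall-clock seconds to import `module` in a new Python process."""
    start = time.perf_counter()
    # A child process guarantees the module is not already cached in sys.modules.
    subprocess.run([sys.executable, "-c", f"import {module}"], check=True)
    return time.perf_counter() - start


if __name__ == "__main__":
    # "json" keeps the sketch runnable anywhere; swap in "juliacall"
    # in an environment where it is installed.
    print(f"{time_fresh_import('json'):.2f} s")
```

Running it once with the default image and once with a custom sysimage configured gives a rough before/after comparison.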

Compiling `PythonCall` into a system image with
[PackageCompiler.jl](https://github.com/JuliaLang/PackageCompiler.jl)
replaces the load and compile steps with a memory-map at startup, typically
cutting that cost by an order of magnitude. For the resulting image to start
`PythonCall` in embedded mode when `import juliacall` loads it, set
`PythonCall._is_embedded[] = true` inside the sysimage-build process.

The file passed as PackageCompiler's `precompile_execution_file=` runs in a
separate child process whose state is not snapshotted, so the flag must be
set via the `script=` keyword instead.

```julia
# bake_embedded.jl
PythonCall._is_embedded[] = true
```

```julia
using PackageCompiler
create_sysimage(["PythonCall"];
    sysimage_path = "myapp.so",
    script = "bake_embedded.jl",
    project = ".",
)
```

Pass `precompile_execution_file=` alongside `script=` to also bake your own
hot code paths into the image.

At runtime, point juliacall at the resulting sysimage via
[`PYTHON_JULIACALL_SYSIMAGE`](@ref julia-config), and set the
[`lib`](@ref pythoncall-config) preference (or `JULIA_PYTHONCALL_LIB`) to the
path of the host's libpython. The embedded path needs an explicit handle
to libpython because the bridge does not load the interpreter itself.
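
On the Python side, both variables have to be in the environment before `import juliacall` runs. The sketch below shows one way to assemble them. The helpers `guess_libpython` and `embedded_env` are hypothetical names, the `sysconfig` lookup is only a best-effort guess that fails on static builds and on Windows, and resolving the sysimage to an absolute path is a choice made here rather than a documented requirement.

```python
import os
import sysconfig
from typing import Dict, Optional


def guess_libpython() -> Optional[str]:
    """Best-effort path to this interpreter's shared libpython, or None."""
    libdir = sysconfig.get_config_var("LIBDIR")
    name = sysconfig.get_config_var("INSTSONAME") or sysconfig.get_config_var("LDLIBRARY")
    if not libdir or not name:
        return None  # e.g. Windows, where these variables are unset
    path = os.path.join(libdir, name)
    # Static builds report a .a archive, which cannot be loaded dynamically.
    if path.endswith(".a") or not os.path.exists(path):
        return None
    return path


def embedded_env(sysimage: str, libpython: Optional[str] = None) -> Dict[str, str]:
    """Build the environment variables named in the paragraph above."""
    env = {"PYTHON_JULIACALL_SYSIMAGE": os.path.abspath(sysimage)}
    lib = libpython or guess_libpython()
    if lib is not None:
        env["JULIA_PYTHONCALL_LIB"] = lib
    return env


if __name__ == "__main__":
    os.environ.update(embedded_env("myapp.so"))
    # import juliacall  # must come after the variables are set
```

If `guess_libpython()` returns `None`, pass the library path explicitly. Without it, a baked image falls back to the inactive behaviour described under "Subprocess behaviour".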

#### Subprocess behaviour

If a Julia process without a running Python interpreter loads a sysimage
baked with `_is_embedded[] = true` (for example a `Base.compilecache`
child), `PythonCall` loads but stays inactive: no error, no Python state.
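
The startup decision can be summarised as a small truth table. The function below is only an illustrative Python model of the branch in `init_context` from the `src/C/context.jl` hunk of this change, not code from the package. The name `embedded_outcome` and the string labels are hypothetical.

```python
def embedded_outcome(has_libptr: bool, baked_flag: bool,
                     lib_loaded: bool, python_running: bool) -> str:
    """Model the startup decision: 'normal', 'embedded', 'error' or 'inactive'."""
    if not (has_libptr or baked_flag):
        return "normal"  # regular, non-embedded discovery
    # juliacall supplies the handle directly; otherwise the `lib` preference must load.
    have_lib = has_libptr or lib_loaded
    if have_lib and python_running:
        return "embedded"
    if has_libptr:
        return "error"  # juliacall passed a handle but no interpreter is running
    return "inactive"  # e.g. a Base.compilecache child of a baked sysimage


if __name__ == "__main__":
    # A baked image loaded by a plain Julia child process.
    print(embedded_outcome(False, True, False, False))
```

The error case is reserved for a process where juliacall itself supplied the libpython handle. A baked flag alone never raises.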

## [Configuration](@id julia-config)

Some features of the Julia process, such as the optimization level or number of threads, may
3 changes: 2 additions & 1 deletion src/C/C.jl
@@ -19,7 +19,8 @@ if @load_preference("exe", "@CondaPkg") == "@CondaPkg"
 end

 import ..PythonCall:
-    python_executable_path, python_library_path, python_library_handle, python_version
+    python_executable_path, python_library_path, python_library_handle, python_version,
+    _is_embedded

 include("consts.jl")
 include("pointers.jl")
102 changes: 89 additions & 13 deletions src/C/context.jl
@@ -105,21 +105,57 @@ on_main_thread

 function init_context()

-    CTX.is_embedded = hasproperty(Base.Main, :__PythonCall_libptr)
+    # Embedded if juliacall set Main.__PythonCall_libptr or the sysimage baked
+    # `_is_embedded[]` to `true`.
+    has_libptr = hasproperty(Base.Main, :__PythonCall_libptr)
+    CTX.is_embedded = has_libptr || _is_embedded[]

     if CTX.is_embedded
-        # In this case, getting a handle to libpython is easy
-        CTX.lib_ptr = Base.Main.__PythonCall_libptr::Ptr{Cvoid}
-        init_pointers()
-        # Check Python is initialized
-        Py_IsInitialized() == 0 && error("Python is not already initialized.")
-        CTX.is_initialized = true
-        CTX.which = :embedded
-        exe_path = Utils.getpref_exe()
-        if exe_path != ""
-            CTX.exe_path = exe_path
-            # this ensures PyCall uses the same Python interpreter
-            get!(ENV, "PYTHON", exe_path)
+        # Locate libpython.
+        if has_libptr
+            CTX.lib_ptr = Base.Main.__PythonCall_libptr::Ptr{Cvoid}
+        else
+            lib_path = Utils.getpref_lib()
+            if lib_path !== nothing
+                lib_ptr = dlopen_e(lib_path, CTX.dlopen_flags)
+                if lib_ptr != C_NULL
+                    CTX.lib_path = lib_path
+                    CTX.lib_ptr = lib_ptr
+                end
+            end
+        end
+
+        embedded_ok = false
+        if CTX.lib_ptr != C_NULL
+            init_pointers()
+            embedded_ok = Py_IsInitialized() != 0
+        end
+
+        if embedded_ok
+            CTX.is_initialized = true
+            CTX.which = :embedded
+            exe_pref = Utils.getpref_exe()
+            if exe_pref != ""
+                CTX.exe_path = exe_pref
+                get!(ENV, "PYTHON", exe_pref)
+            else
+                exe_path = _embedded_program_path()
+                if exe_path !== nothing
+                    CTX.exe_path = exe_path
+                    get!(ENV, "PYTHON", exe_path)
+                end
+            end
+        elseif has_libptr
+            error("PythonCall is in embedded mode but no Python interpreter is running in this process.")
+        else
+            # Either the `lib` preference is unset, or Python is not running
+            # in this process (e.g. a julia.exe child of `Base.compilecache`
+            # loaded a sysimage baked for the embedded path). Leave PythonCall
+            # inactive instead of erroring.
+            CTX.is_embedded = false
+            CTX.lib_ptr = C_NULL
+            CTX.lib_path = missing
+            return
         end
     else
         # Find Python executable
@@ -322,6 +358,46 @@ function init_context()
     return
 end

+# Return `sys.executable` as a String, or nothing. Requires init_pointers().
+function _embedded_program_path()
+    import_mod = dlsym_e(CTX.lib_ptr, :PyImport_ImportModule)
+    getattr = dlsym_e(CTX.lib_ptr, :PyObject_GetAttrString)
+    asutf8 = dlsym_e(CTX.lib_ptr, :PyUnicode_AsUTF8AndSize)
+    decref = dlsym_e(CTX.lib_ptr, :Py_DecRef)
+    errclear = dlsym_e(CTX.lib_ptr, :PyErr_Clear)
+    (import_mod == C_NULL || getattr == C_NULL || asutf8 == C_NULL ||
+     decref == C_NULL || errclear == C_NULL) && return nothing
+
+    sys_mod = ccall(import_mod, Ptr{Cvoid}, (Ptr{Cchar},), "sys")
+    if sys_mod == C_NULL
+        ccall(errclear, Cvoid, ())
+        return nothing
+    end
+    result = nothing
+    try
+        exec_obj = ccall(getattr, Ptr{Cvoid}, (Ptr{Cvoid}, Ptr{Cchar}), sys_mod, "executable")
+        if exec_obj == C_NULL
+            ccall(errclear, Cvoid, ())
+            return nothing
+        end
+        try
+            size_ref = Ref{Cssize_t}(0)
+            cstr = ccall(asutf8, Ptr{Cchar}, (Ptr{Cvoid}, Ref{Cssize_t}), exec_obj, size_ref)
+            if cstr == C_NULL
+                ccall(errclear, Cvoid, ())
+                return nothing
+            end
+            size_ref[] == 0 && return nothing
+            result = unsafe_string(cstr, size_ref[])
+        finally
+            ccall(decref, Cvoid, (Ptr{Cvoid},), exec_obj)
+        end
+    finally
+        ccall(decref, Cvoid, (Ptr{Cvoid},), sys_mod)
+    end
+    return result
+end
+
 function Base.show(io::IO, ::MIME"text/plain", ctx::Context)
     show(io, typeof(io))
     print(io, ":")
1 change: 1 addition & 0 deletions src/Compat/Compat.jl
@@ -23,6 +23,7 @@ include("serialization.jl")
 include("tables.jl")

 function __init__()
+    C.CTX.is_initialized || return
     init_gui()
     init_pyshow()
 end
1 change: 1 addition & 0 deletions src/Convert/Convert.jl
@@ -36,6 +36,7 @@ include("numpy.jl")
 include("pandas.jl")

 function __init__()
+    C.CTX.is_initialized || return
     init_pyconvert()
     init_ctypes()
     init_numpy()
3 changes: 3 additions & 0 deletions src/Core/Core.jl
@@ -209,6 +209,9 @@ include("juliacall.jl")
 include("pyconst_macro.jl")

 function __init__()
+    # Skip if C bailed out (e.g. a julia.exe child of Base.compilecache
+    # loaded a sysimage baked for the embedded path).
+    C.CTX.is_initialized || return
     init_consts()
     init_datetime()
     init_stdlib()
1 change: 1 addition & 0 deletions src/JlWrap/C.jl
@@ -364,6 +364,7 @@ function init_c()
 end

 function __init__()
+    C.CTX.is_initialized || return
     init_c()
 end

1 change: 1 addition & 0 deletions src/JlWrap/JlWrap.jl
@@ -51,6 +51,7 @@ include("set.jl")
 include("callback.jl")

 function __init__()
+    C.CTX.is_initialized || return
     init_base()
     init_raw()
     init_any()
8 changes: 8 additions & 0 deletions src/PythonCall.jl
@@ -2,6 +2,14 @@ module PythonCall

 const ROOT_DIR = dirname(@__DIR__)

+"""
+    PythonCall._is_embedded
+
+Marks the running sysimage as embedded in a Python host. Set to `true` in a
+PackageCompiler `script=` to bake the embedded path into the sysimage.
+"""
+const _is_embedded = Ref(false)
+
 include("API/API.jl")
 include("Utils/Utils.jl")
 include("NumpyDates/NumpyDates.jl")
1 change: 1 addition & 0 deletions src/Wrap/Wrap.jl
@@ -32,6 +32,7 @@ include("PyTable.jl")
 include("PyPandasDataFrame.jl")

 function __init__()
+    C.CTX.is_initialized || return
     priority = PYCONVERT_PRIORITY_ARRAY
     pyconvert_add_rule("<arraystruct>", PyArray, pyconvert_rule_array_nocopy, priority)
     pyconvert_add_rule("<arrayinterface>", PyArray, pyconvert_rule_array_nocopy, priority)