Conversation

@ColmTalbot ColmTalbot commented Jan 7, 2025

I've been working on this PR on and off for a few months. It isn't ready yet, but I wanted to share it in case other people have early opinions.

The goal is to make it easier to interface with models/samplers implemented in e.g., JAX, that support GPU/TPU acceleration and JIT compilation.

The general guiding principles are:

  • when possible, maintain existing behaviour for numpy/builtin arguments
  • work introspectively, inferring the backend from input types so users don't need to specify it
  • write as little backend-specific code as possible, mostly by relying on the array-api specification and scipy interoperability
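As an illustration of the introspection principle, here is a minimal sketch of backend detection. The helper name is hypothetical; in practice something like array_api_compat.array_namespace would do this properly.

```python
import importlib

import numpy as np


def array_namespace(arr):
    """Return the array module an object belongs to (minimal sketch).

    Python scalars and numpy arrays resolve to numpy, so existing
    behaviour is unchanged; other array types resolve to the package
    that defines them.
    """
    package = type(arr).__module__.split(".")[0]
    if package in ("builtins", "numpy"):
        return np
    # e.g., a cupy.ndarray would resolve to the cupy module
    return importlib.import_module(package)
```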

The primary changes so far are:

  • making most priors backend independent; there are a few holdouts where the underlying scipy functionality isn't compatible yet
  • core likelihoods mostly work with data from any backend
  • GW likelihoods work with any backend supported by the source function
  • the GW detector objects don't work via introspection, they need to be manually set
  • GW geometry (currently in bilby_cython) is handled via multiple-dispatch and added back into bilby

Changed behaviour:

Remaining issues:

  • Saving/loading non-numpy arrays in result files may not work
  • I added some additional parameter conversions that I will remove
  • the bilby.gw.jaxstuff file should be removed and the relevant functionality moved elsewhere; it's currently just used for testing
  • the ROQ likelihood hasn't been ported
  • add more testing with JAX
  • translate some of the hyperparameter functionality, cf. GWPopulation

@ColmTalbot ColmTalbot added the enhancement (New feature or request) label Jan 7, 2025
@ColmTalbot ColmTalbot marked this pull request as draft January 7, 2025 19:38
@ColmTalbot ColmTalbot force-pushed the bilback branch 2 times, most recently from ea348fa to 771a8a9, January 22, 2026 17:00
@ColmTalbot ColmTalbot marked this pull request as ready for review January 23, 2026 15:24
@ColmTalbot ColmTalbot changed the title from "DRAFT: Support non-numpy array backends" to "Support non-numpy array backends" Jan 23, 2026
@ColmTalbot ColmTalbot added the >100 lines, refactoring, and to discuss (To be discussed on an upcoming call) labels Jan 23, 2026
@ColmTalbot
Collaborator Author

This is now ready for review.
There are some things that won't work with JAX at the moment, e.g., various combinations of likelihood marginalization/acceleration.
I think we should accept this as is, at least for a bilby v3 alpha/beta release, and keep chipping away at the various subcases over time.

There are a lot of changes, but most of them are essentially np -> xp.
Some things required refactoring to avoid modifying slices of arrays, since JAX doesn't support in-place slice assignment.

Bilby can once again be installed without bilby_cython.
This should improve our general portability, but when bilby_cython is installed it will be used.

I've managed to keep test changes minimal:

  • I updated the joint prior test to make it more stringent (keys more randomly ordered).
  • I refactored some expensive prior initialization that was dramatically slowing things down.
  • I improved the logic for figuring out when ROQs are available to help my local testing.
  • Some mocks of numpy had to be updated.

@mj-will mj-will added this to the 3.0.0 milestone Jan 27, 2026

@mj-will mj-will left a comment

Some initial comments but I'll need to have another look.

elif aac.is_cupy_namespace(xp):
from cupyx.scipy.special import erfinv
else:
raise BackendNotImplementedError
Collaborator

I think it would be useful to include the backend in the error.
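For example (a sketch; the error class mirrors the one in the snippet above, and the helper name is hypothetical):

```python
class BackendNotImplementedError(NotImplementedError):
    """A special function has no implementation for the given backend."""


def erfinv_for_backend(xp):
    """Return an erfinv implementation for `xp`, naming the backend on failure."""
    name = getattr(xp, "__name__", repr(xp))
    if name.startswith("numpy"):
        from scipy.special import erfinv
        return erfinv
    raise BackendNotImplementedError(
        f"erfinv is not implemented for backend '{name}'"
    )
```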

__all__ = ["array_module", "promote_to_array"]


def array_module(arr):
Collaborator

This would benefit from a doc-string

return np


def promote_to_array(args, backend, skip=None):
Collaborator

Have you thought about how devices would be handled here?

Moving arrays to and from GPUs can sometimes require more than just calling array.
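A sketch of what that could look like. The `device` keyword is a hypothetical extension of the signature above, and array_api_compat's `to_device` is one way to handle cross-device moves:

```python
import numpy as np


def promote_to_array(args, backend, skip=None, device=None):
    """Promote each argument to an array of `backend` (sketch).

    `device` is a hypothetical addition: on GPU backends, placement can
    need an explicit move rather than a plain asarray call.
    """
    skip = skip if skip is not None else []
    out = []
    for ii, arg in enumerate(args):
        if ii in skip:
            out.append(arg)
            continue
        arr = backend.asarray(arg)
        if device is not None:
            # assumption: array_api_compat is available when a device
            # move is requested
            from array_api_compat import to_device
            arr = to_device(arr, device)
        out.append(arr)
    return tuple(out)
```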

return np


def promote_to_array(args, backend, skip=None):
Collaborator

Suggest adding a doc-string

import os

import numpy as np
os.environ["SCIPY_ARRAY_API"] = "1" # noqa # flag for scipy backend switching
Collaborator

I worry slightly about having this hard coded. Does it introduce more overhead when using just numpy?
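One way to make this less rigid (a sketch): default the flag instead of forcing it, so a user can opt out via the environment. Whether scipy's array-API dispatch adds measurable overhead for plain numpy inputs is worth benchmarking.

```python
import os

# respect an explicit user choice; only default to array-API dispatch
# (must run before scipy is imported)
os.environ.setdefault("SCIPY_ARRAY_API", "1")
```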

This maps to the inverse CDF. This has been analytically solved for this case.
"""
return gammaincinv(self.k, val) * self.theta
return xp.asarray(gammaincinv(self.k, val)) * self.theta
Collaborator

Does this mean this is falling back to numpy?

Collaborator Author

Yeah, I should update/recheck this. jax doesn't have good support for this at the moment, though it looks like tensorflow has a version that numpyro uses (jax-ml/jax#5350). cupy does have this function, so this workaround may have just been for jax. I could add a BackendNotImplementedError.

)
)

betaln,
Collaborator

Not sure what this is.

Collaborator Author

Not anything good.

Suggested change
betaln,

Comment on lines +852 to +853
# return self.check_ln_prob(sample, ln_prob,
# normalized=normalized)
Collaborator

Is the removal of this intentional?

Collaborator Author

I'm fairly sure it was, but I'll double check. I think check_ln_prob was problematic in some way.

Comment on lines 877 to 902
self[key].least_recently_sampled = result[key]
if isinstance(self[key], JointPrior) and self[key].dist.distname not in joint:
joint[self[key].dist.distname] = [key]
elif isinstance(self[key], JointPrior):
joint[self[key].dist.distname].append(key)
for names in joint.values():
# this is needed to unpack how joint prior rescaling works
# as an example of a joint prior over {a, b, c, d} we might
# get the following based on the order within the joint prior
# {a: [], b: [], c: [1, 2, 3, 4], d: []}
# -> [1, 2, 3, 4]
# -> {a: 1, b: 2, c: 3, d: 4}
values = list()
for key in names:
values = np.concatenate([values, result[key]])
for key, value in zip(names, values):
result[key] = value

def safe_flatten(value):
"""
this is gross but can be removed whenever we switch to returning
arrays, flatten converts 0-d arrays to 1-d so has to be special
cased
"""
if isinstance(value, (float, int)):
return value
Collaborator

Is removing this intentional?

Collaborator Author

Yeah, this is in line with one of the other open PRs to update this logic. I'll dig it out in my next pass.

Comment on lines +250 to +251
# delta_x = ifos[0].geometry.vertex - ifos[1].geometry.vertex
# theta, phi = zenith_azimuth_to_theta_phi(zenith, azimuth, delta_x)
Collaborator

Suggest we remove this.

Collaborator Author

Suggested change
# delta_x = ifos[0].geometry.vertex - ifos[1].geometry.vertex
# theta, phi = zenith_azimuth_to_theta_phi(zenith, azimuth, delta_x)


@ColmTalbot ColmTalbot left a comment

Thanks for the initial comments @mj-will I'll take a pass at them ASAP.

"""
at_peak = (val == self.peak)
return np.nan_to_num(np.multiply(at_peak, np.inf))
return at_peak * 1.0
Collaborator Author

Yeah

(xp.sin(val) - xp.sin(self.minimum)) /
(xp.sin(self.maximum) - xp.sin(self.minimum))
)
_cdf *= val >= self.minimum
Collaborator Author

This kind of in-place operation works; it's just assignment into slices that doesn't work, e.g., patterns like this that sometimes existed:

_cdf = ...
_cdf[val < self.minimum] = 0
_cdf[val > self.maximum] = 1
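A JAX-safe rewrite of that pattern (a sketch; the function name is illustrative and it is equally valid numpy) replaces slice assignment with functional masking:

```python
import numpy as np


def bounded_cdf(val, minimum, maximum, raw_cdf):
    """Clamp a CDF outside its support without slice assignment.

    Equivalent to
        _cdf[val < minimum] = 0
        _cdf[val > maximum] = 1
    but expressed with `where`, which JAX can trace and jit-compile.
    """
    xp = np  # in bilby this would come from introspecting `val`
    cdf = xp.where(val < minimum, 0.0, raw_cdf)
    cdf = xp.where(val > maximum, 1.0, cdf)
    return cdf
```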

The natural logarithm of the bessel function
"""
return np.log(i0e(value)) + np.abs(value)
xp = array_module(value)
Collaborator Author

Comment to self: use xp_wrap here.
