RNG-188: Add Philox4x32 and Philox4x64 random number generators #191

Merged
aherbert merged 5 commits into apache:master from jherekhealy:philox
Feb 12, 2026

@jherekhealy
Contributor

Dear commons-rng team,

This is a pull request to provide two new random number generators: philox4x32 and philox4x64 from https://www.thesalmons.org/john/random123/

These generators are now quite standard (part of CUDA and NumPy, the default for PyTorch on GPU, and the default for TensorFlow), yet there is no official Java implementation. The proposed implementation matches the numbers produced by NumPy and Python's randomgen. It passes all commons-rng unit tests.

Counter-based PRNGs are well suited to parallelization: skipping ahead is nearly as fast as generating a single random number, and the counter makes it very easy to create subsequences. The user then has complete control over how to split the sequence.
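
To illustrate why skipping ahead is cheap for a counter-based generator, here is a minimal sketch. It is not Philox: the class name is hypothetical and the mixing function is a SplitMix64-style stand-in chosen only to keep the example short. The point is that each output is a pure function of the key and the counter, so skipping n outputs is a single addition.

```java
/** Toy counter-based generator: output i depends only on (key, counter + i). */
final class CounterBasedSketch {
    private final long key;
    private long counter;

    CounterBasedSketch(long key, long counter) {
        this.key = key;
        this.counter = counter;
    }

    /** Stand-in mixing function (SplitMix64 finalizer), NOT the Philox round function. */
    private static long mix(long z) {
        z = (z ^ (z >>> 30)) * 0xbf58476d1ce4e5b9L;
        z = (z ^ (z >>> 27)) * 0x94d049bb133111ebL;
        return z ^ (z >>> 31);
    }

    /** Returns the output for the current counter, then advances the counter by one. */
    long nextLong() {
        return mix(key ^ mix(counter++));
    }

    /** Skipping n outputs is a single addition on the counter. */
    void skip(long n) {
        counter += n;
    }
}
```

Two instances with the same key and disjoint counter ranges produce non-overlapping subsequences, which is the splitting property described above.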

In addition, I have added two new methods that convert a long, or two ints, to a double in the open interval (0,1). This is useful for many applications such as Monte Carlo simulations, where uniform random numbers are mapped to some distribution. One common technique relies on the inverse cumulative distribution function, which is only defined on the open interval.
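
One possible way to map 64 random bits to the open interval is sketched below. This is an assumption for illustration, not necessarily the conversion used in this PR; the class and method names are hypothetical. It keeps the upper 52 bits and offsets by half a step so that neither 0 nor 1 can be returned.

```java
final class OpenIntervalSketch {
    /** Maps 64 random bits to a double in (0, 1) using the upper 52 bits. */
    static double toOpenDouble(long bits) {
        // (bits >>> 12) lies in [0, 2^52 - 1]; adding 0.5 is exact in double
        // arithmetic, so the result lies in [2^-53, 1 - 2^-53].
        return ((bits >>> 12) + 0.5) * 0x1.0p-52;
    }
}
```

With this mapping, an inverse CDF that diverges at 0 or 1 never receives an endpoint.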

Best wishes,

Jherek

P.S.: below are some performance numbers:

NextDoubleGenerationPerformance.nextDouble                   BASELINE  avgt   10   0,390 ±  0,001  ns/op
NextDoubleGenerationPerformance.nextDouble                        JDK  avgt   10  14,578 ±  0,007  ns/op
NextDoubleGenerationPerformance.nextDouble                PHILOX_4X32  avgt   10   7,121 ±  0,007  ns/op
NextDoubleGenerationPerformance.nextDouble                PHILOX_4X64  avgt   10   8,059 ±  0,009  ns/op
NextDoubleGenerationPerformance.nextDouble                 WELL_512_A  avgt   10   4,709 ±  0,005  ns/op

As you can see, they are not the fastest generators, but they are fast enough for most purposes, and the ease and flexibility of parallelization make up for the generation speed. The 64-bit version could be faster with Math.unsignedMultiplyHigh (available since Java 18). However, I found it problematic to use an MR-JAR (which would be ideal for this) with JaCoCo.
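
For context, the high 64 bits of an unsigned 64x64-bit product can be computed on older JDKs without the intrinsic. The following is a minimal portable sketch, not the code from this PR; the class and method names are hypothetical.

```java
final class MulHighSketch {
    /** Returns the upper 64 bits of the 128-bit unsigned product a * b. */
    static long unsignedMultiplyHigh(long a, long b) {
        // Split each operand into unsigned 32-bit halves.
        final long aLo = a & 0xffffffffL;
        final long aHi = a >>> 32;
        final long bLo = b & 0xffffffffL;
        final long bHi = b >>> 32;

        final long lolo = aLo * bLo;
        final long hilo = aHi * bLo;
        final long lohi = aLo * bHi;
        final long hihi = aHi * bHi;

        // Carry out of bits 32..63 of the full product (sum fits in a long).
        final long carry = ((lolo >>> 32) + (hilo & 0xffffffffL) + (lohi & 0xffffffffL)) >>> 32;
        return hihi + (hilo >>> 32) + (lohi >>> 32) + carry;
    }
}
```

A JIT-compiled intrinsic typically replaces these four multiplications with a single instruction, which is the likely source of the speed-up mentioned above.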

Contributor

@aherbert left a comment

Thanks for the contribution.

Did you code this from the referenced paper? I think there are some edge cases around jumping when the counter is negative or approaches overflow that need to be tested against a reference implementation. This is particularly due to the use of signed integers in Java, which may not be accounted for in a reference paper using e.g. C.

The int provider does not reset the internal state after one of the jumps. The long provider does though.

All of the comments for the int provider seem to apply to the long provider as the code is very similar just with longs. Please review that class too when updating.

I think the addition of a method for the open interval should be in a separate PR. It would be best to discuss its integration into the code on the mailing list.

I will have a look at running this RNG through the benchmarking code (TestU01 BigCrush and PractRand). Is the implementation expected to pass these test suites?

int old = counter0;
counter0 += nlo;
if (Integer.compareUnsigned(counter0, old) < 0) {
nhi++;
Contributor

If nhi is -1 (all bits set) then this increment will not be carried to counter2.
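
One way to avoid losing a carry is to do the addition in 64-bit arithmetic on unsigned 32-bit limbs, so that each carry is explicit. This is a minimal sketch with hypothetical names, assuming the counter is stored as four ints with the least significant first; it is not the PR's code.

```java
final class CounterCarrySketch {
    private static final long MASK = 0xffffffffL;

    /** Adds n (treated as unsigned 64-bit) to a 128-bit counter held in four ints. */
    static void add(int[] counter, long n) {
        // Each partial sum is at most 2^33, so the carry is always in bit 32.
        long s = (counter[0] & MASK) + (n & MASK);
        counter[0] = (int) s;
        s = (counter[1] & MASK) + (n >>> 32) + (s >>> 32);
        counter[1] = (int) s;
        s = (counter[2] & MASK) + (s >>> 32);
        counter[2] = (int) s;
        // Overflow out of the top limb wraps around.
        counter[3] += (int) (s >>> 32);
    }
}
```

Because the high word of n and the incoming carry are summed together with counter1 in a wider type, the case where the high word is all ones still propagates into counter2.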

Contributor Author

This code will be removed, as per your suggestion to implement this later with the RandomGenerator.ArbitrarilyJumpableGenerator interface.

counter3 = state[5];
bufferPosition = state[6];
super.setStateInternal(c[1]);
rand10();
Contributor

Calling rand10 would change the state and prevent restoring to exactly the same state saved by getStateInternal.

Contributor Author

The call to rand10() is actually there to restore the buffer. rand10() only updates the buffer, based on the current state (key and counter). I added a comment in the code.

final Philox4x32 copy = copy();
incrementCounter(1L << 32);
rand10();
resetCachedState();
Contributor

Why call resetCachedState here but not in the longJump?

Contributor Author

it should indeed be called in both places.

* @param seed key0,key1,counter0,counter1,counter2,counter3.
*/
@SuppressWarnings("PMD.AvoidLiteralsInIfCondition")
public Philox4x32(int... seed) {
Contributor

All these constructors are not required. Just a single constructor accepting int[] would be fine.

This can be simplified to:

final int[] input = seed.length < 6 ? Arrays.copyOf(seed, 6) : seed;
key0 = input[0];
...
counter3 = input[5];
bufferPosition = PHILOX_BUFFER_SIZE;

Contributor Author

much nicer indeed!
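
For readers following along, the single-constructor idea can be sketched as a standalone class. The class name and field layout here are hypothetical and only mirror the seed order given in the Javadoc (key0, key1, counter0 to counter3); a short seed is zero-padded by `Arrays.copyOf`.

```java
import java.util.Arrays;

final class SeedPaddingSketch {
    private static final int SEED_SIZE = 6;

    final int key0;
    final int key1;
    final int counter0;
    final int counter1;
    final int counter2;
    final int counter3;

    /** Accepts a seed of any length; missing trailing values default to zero. */
    SeedPaddingSketch(int[] seed) {
        final int[] input = seed.length < SEED_SIZE ? Arrays.copyOf(seed, SEED_SIZE) : seed;
        key0 = input[0];
        key1 = input[1];
        counter0 = input[2];
        counter1 = input[3];
        counter2 = input[4];
        counter3 = input[5];
    }
}
```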

* @param seed key for Philox
* @param subsequence counter third and fourth ints.
*/
public void resetState(long seed, long subsequence) {
Contributor

The resetState, setOffset and getOffset methods are unique to this class and do not fit within the current API of the RNGs in the library. I am not sure if this support is required other than to demonstrate the arbitrary jumping ability of the RNG.

In the case of the jump(long n) method it may require some thought for a new addition to the JumpableUniformRandomProvider interface for configurable jump size.

Contributor Author

Yes. The idea was to provide a more user friendly interface to the "raw" key and counter. But I get your point. It may create ambiguities.

The jump(long) is indeed an attractive feature of these counter-based PRNGs. But a clever user may implement the same directly on the int[] constructor (with the caveat that jumps need to be a multiple of 4). It also looks like the RandomGenerator.ArbitrarilyJumpableGenerator interface would be more appropriate to implement, as you suggested.


import org.apache.commons.rng.UniformRandomProvider;
import org.apache.commons.rng.RestorableUniformRandomProvider;
import org.apache.commons.rng.UniformRandomProvider;
Contributor

The imports here have been sorted. I am trying to understand what the order in master currently represents. It seems to be partly the historical order in which providers were added. I don't think this is needed, but it would be best to remove the reordering here to make the PR addition clearer; I can fix the order in master separately.

Contributor Author

ok

*/
PHILOX_4X32(ProviderBuilder.RandomSourceInternal.PHILOX_4X32),
/**
* Source of randomness is {@link org.apache.commons.rng.core.source32.Philox4x32}.
Contributor

source64

*/
@Override
public int next() {
if (bufferPosition < PHILOX_BUFFER_SIZE) {
Contributor

There are some strange performance issues on JVMs with the use of post increments. This is something to be avoided when working with arrays if the same thing can be achieved with a pre-increment counter (offset by 1 for the range checks) or use of a temp counter.

I wonder if this would be more performant:

final int p = bufferPosition;
if (p < PHILOX_BUFFER_SIZE) {
    bufferPosition = p + 1;
    return buffer[p];
}

Contributor Author

ok

k1 += PHILOX_W1;
singleRound(buffer, k0, k1);
}

Contributor

Double line break

int origValue = rngOrig.nextInt();
assertEquals(origValue, jumpValue);
}

Contributor

Remove extra line breaks

@aherbert
Contributor

aherbert commented Feb 9, 2026

I've had a bit more time to review this in comparison to the other jumpable RNGs. The smallest jump supported by the others is 2^64. The jumps are all documented in the same way, so you can get an idea of the jump sizes using:

git grep -A 2 'The jump size is the equivalent of'

I think the Philox 32-bit generator can be updated to jump 4 * 2^64 and 4 * 2^96. This would make the jumps similar to those provided by the AbstractXoShiRo128 generators: XoShiRo128Plus, XoShiRo128PlusPlus, XoShiRo128StarStar. This size of jump is much easier to implement. The small jump increments counter2 and overflows to counter3; the long jump increments counter3.

A similar change in the 64-bit generator would have jumps of 4 * 2^128 and 4 * 2^192.

The purpose of long jumping is to move far enough ahead to be able to create a stream of jumpable generators, where each jumpable generator can stream effectively unlimited RNGs which will not overlap within reason. A jump size of 4 * 2^32 creates a generator that could overlap with the next one in a long-running use of RNG output. The next size up, 4 * 2^64, would require significantly more time before the sequences overlap.
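
The counter manipulation described above can be sketched as follows. This is an illustrative outline with hypothetical names, showing only the counter update and the returned copy; a real generator would also have to regenerate its output buffer and reset any cached state after the jump, as noted elsewhere in this review.

```java
final class JumpSketch {
    /** counter[0] is the least significant word, counter[3] the most significant. */
    final int[] counter;

    JumpSketch(int c0, int c1, int c2, int c3) {
        counter = new int[] {c0, c1, c2, c3};
    }

    JumpSketch copy() {
        return new JumpSketch(counter[0], counter[1], counter[2], counter[3]);
    }

    /** Small jump: advance the counter by 2^64 and return the pre-jump state. */
    JumpSketch jump() {
        final JumpSketch copy = copy();
        // Increment counter2; a wrap to zero carries into counter3.
        if (++counter[2] == 0) {
            counter[3]++;
        }
        return copy;
    }

    /** Long jump: advance the counter by 2^96 and return the pre-jump state. */
    JumpSketch longJump() {
        final JumpSketch copy = copy();
        counter[3]++;
        return copy;
    }
}
```

Since each counter value yields four 32-bit outputs, advancing the counter by 2^64 skips four times that many outputs.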

If you can copy the javadoc wording from the other generators to state the jump size and expected output length of each child generator that would be helpful.

As for configurable jumping, I believe we require some interface similar to the JDK's RandomGenerator.ArbitrarilyJumpableGenerator. This functionality can be added after the main Philox generator implementation is merged.

@aherbert changed the title from "add Philox4x32 and Philox4x64 random number generators" to "RNG-188: Add Philox4x32 and Philox4x64 random number generators" on Feb 10, 2026
@jherekhealy
Contributor Author

Thanks for your detailed comments. I have attempted to address most of them in the latest commit.
Regarding your question on TestU01, yes, it passes BigCrush - this is detailed in the original paper https://dl.acm.org/doi/epdf/10.1145/2063384.2063405

Contributor

@aherbert left a comment

Thanks for the update. I noticed the error with resetting the state is still present after long jump of the 32-bit RNG.

The library typically does not support many constructors for each RNG. It is expected that an RNG has only a single constructor that accepts the full-length seed. Typical use of the generators would be through the commons-rng-simple package, where you can create a generator with all types of seeds using the seed conversion routines.

I would remove all but the array constructor. Then put the full seed for the expected output sequence into the unit test. See for example the other tests in the source. One way to test a few seeds is to stream the seed and expected output to a parameterized test. See for example the L64X256MixTest. Your case would require the seeds 1234 and 67280421310721L.

final Philox4x32 copy = copy();
counter3++;
rand10();
return copy;
Contributor

Still missing a call to resetCachedState()

@Test
void testLongJumpCounter() {
Philox4x64 rng = new Philox4x64(new long[]{1234L, 0, 0xffffffffffffffffL, 0, 0xffffffffffffffffL, 0});
UniformRandomProvider rngOrig = rng.jump();
Contributor

rngOrig is unused.

@Test
void testLongJumpCounter() {
Philox4x32 rng = new Philox4x32(new int[]{1234, 0, 0xffffffff, 0xffffffff, 0xffffffff, 0});
UniformRandomProvider rngOrig = rng.longJump();
Contributor

rngOrig is not used.

@Test
void testJumpCounter() {
Philox4x32 rng = new Philox4x32(new int[]{1234, 0, 0xffffffff, 0xffffffff, 0xffffffff, 0});
UniformRandomProvider rngOrig = rng.jump();
Contributor

rngOrig is not used.

}

@Test
void testDouble() {
Contributor

This test is not required.

Note that this test is doomed by using a xor of the first 21-bits for the upper (v) and the full 32-bits of the lower (w) when comparing to the nextDouble method which composes the most significant 26 from v with 27 from w.

Anyway remove this redundant test.
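
To make the bit composition concrete, a double built from the most significant 26 bits of v and 27 bits of w could look like the sketch below. It follows the description in this comment; the class name is hypothetical and the code is an illustration rather than a copy of the library source.

```java
final class MakeDoubleSketch {
    /** Builds a double in [0, 1) from 26 bits of v (high part) and 27 bits of w (low part). */
    static double makeDouble(int v, int w) {
        final long high = ((long) (v >>> 6)) << 27; // most significant 26 bits of v
        final int low = w >>> 5;                    // most significant 27 bits of w
        // 53 bits in total, scaled by 2^-53.
        return (high | low) * 0x1.0p-53;
    }
}
```

A test that combines different bits of v and w (for example a xor of 21 bits with a full 32-bit word) cannot match this output, which is why the test above was bound to fail.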


@Test
void testInternalCounter() {
//test of incrementCounter
Contributor

These tests could use: TestUtils.assertNextIntEquals. Matching a single output is unlikely but possible with different seeds. Matching a sequence of output is increasingly unlikely with length.


@Test
void testInternalCounter() {
//test of incrementCounter
Contributor

These tests could use: TestUtils.assertNextLongEquals. Matching a single output is unlikely but possible with different seeds. Matching a sequence of output is increasingly unlikely with length.

case SPLIT_MIX_64:
case TWO_CMRES:
case TWO_CMRES_SELECT:
case PHILOX_4X32:
Contributor

This generator has a native seed size of 6.

@codecov-commenter

codecov-commenter commented Feb 11, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 99.71%. Comparing base (1e43bd4) to head (2c64b0e).
⚠️ Report is 179 commits behind head on master.

Additional details and impacted files
@@             Coverage Diff              @@
##             master     #191      +/-   ##
============================================
+ Coverage     99.69%   99.71%   +0.01%     
+ Complexity     1495      810     -685     
============================================
  Files           152      154       +2     
  Lines          5996     6267     +271     
  Branches        564      576      +12     
============================================
+ Hits           5978     6249     +271     
  Misses           12       12              
  Partials          6        6              

@aherbert
Contributor

I am fine to merge this when it passes the CI build. Currently it is failing on a malformed javadoc tag. I do not know why so it may just need reformatting to a single line. If you can run the default maven goal locally (using mvn) then you should see the error.

@aherbert
Contributor

Sorry, the master branch ran an invalid integration test. I've corrected the dependencies for that. If you rebase on master then hopefully this will pass CI on all tested JDKs.

@aherbert merged commit e30ab4b into apache:master on Feb 12, 2026
7 checks passed