Add GitLab and GitHub CI tests#133
Add GitLab and GitHub CI tests#133rpgoldman wants to merge 5 commits intologic-and-learning-lab:mainfrom
Conversation
31d060c to
598a84b
Compare
|
is this different to #126 ? |
|
This is the better one. I need to check and see why adding [Notes mostly for myself:] Here's the log with error and backtrace: Failure is here: I'm not sure that this is deterministically repeatable. |
|
I reran the tests and now they all pass. That implies there's something non-deterministic here. The open question is: "did I break something and make it non-deterministic, or was it always non-deterministic?" If this is non-deterministic, and there's an error condition here, does that mean that this condition (the The secondary question is whether this behavior invalidates the test structure here. I.e., do you want to go ahead and merge this, and perhaps add a separate issue for fixing the size overflow? It's possible that we should treat this test as an expected failure, until that can be fixed (since we would like to be able to use the tests as a way to vet Popper changes). You are obviously the person to make these judgment calls. |
I think some of your code has changed. Looking at the log, when Popper generates this hypothesis: Then it should have teach this code point and returned: I do not see how any non-determinism could prevent that. I just tried your code but I cannot replicate the crash. |
This needed to get pkg_resources, which is used in Popper.
598a84b to
82c531a
Compare
|
@andrewcropper I've just run this 20 times locally, and the test on I note that your fixes to |
|
OK, I just ran all the tests locally and everything looks fine. But I got a different failure on GitHub. I wonder if this has to do with running the tests in parallel on the GH runner? I will see if tweaking the way things run helps. So for now, I am marking this as draft again. |
|
I have no idea and I do not know what CI/GH runner etc are, but I am confident that the bug is not a standalone Popper one. |
I agree — I’m not seeing this issue locally or in the GitLab CI. |
GitHub runner has only 4 cores, so lower maxparallel from 16 to 4.
|
@andrewcropper I think that I have fixed the problem on GitHub: the problem appears to have been related to the fact that I was trying to run 16 concurrent jobs on a machine (the GitHub "runner") that has only 4 cores. I dropped the number of cores and now the tests pass reliably. I hope that this will help you in maintaining and improving Popper, because it provides a way to detect changes that degrade the behavior of Popper. Of course, this kind of end-to-end test can only do so much, but it should make it easier to move 3 steps forward without moving 2 steps back! |
|
Super basic questions, which I are related to this message:
Q1. Where are the actual tests? In a docker image that is not yet available? |
The actual tests are defined in the popper-prt system. They are in the docker image, but can be inspected in that system (and a new Docker image may be created easily). More specifically, the
GitHub itself will run the tests after every push. It will also check merge requests and complain if the code in the corresponding branch does not pass the tests. This is the primary case for having these tests. The README.md for popper-prt explains how to use the docker image to run the tests against a local copy of Popper. The docker image makes things easier to use, but also overcomes a minor organizational issue. My company is making the prt library available open source, but this requires someone doing the drudge work of getting the right license text spattered everywhere before the source repository can be put on GitHub. I continue to push for this to happen, but unsurprisingly, there are few volunteers for this task. The docker image allows us to stop waiting...
The tests need no maintenance until there's a version of Popper that provides improved performance on one or more of the existing examples, or there are new examples to add to the test suite. In the former case, updated rubric file(s) would need to be stored in the popper-prt repository (which you could fork or, for that matter, we could move into the logic-and-learning-lab GitHub group). If there are new examples, one would need to add a new The main idea is to make it possible to improve Popper while being confident that one is not breaking anything. If there are any problems, you can post them here or as issues on the |
These two tests aim to do the same thing. The GitLab actions run on SIFT's local GitLab repo, and the GitHub actions are intended to run on GitHub.
The tests use SIFT's
prtregression test framework and run on the examples distributed with Popper.prtis had in the Docker image. The Docker image is built from thepopper-prtrepo, the source for which we will make available.