Add AGENTS.md by Scienfitz · Pull Request #769 · emdgroup/baybe

Scienfitz · 2026-03-29T22:18:39Z

AGENTS.md files contain content intended for agentic operators. They are recognized by most coding frameworks (most importantly claude and opencode) and are injected into the context whenever an agent reads a folder that contains such a file. They lead to more consistent generated code that is generally more in line with what has already been done, without having to state these conventions explicitly over and over again.

The content here is meant as a starting point, not as a complete set. We can continue to add rules as we evolve.

The content has been produced in the following manner:

1. Analyze the entire code base and the last 200 PRs, including comments.
2. Extract from this the rules and conventions that apply to this repo.
3. Compress the resulting files.
4. Do one coarse round of human review and curation.

Copilot

Pull request overview

Adds AGENTS.md guidance files intended to be injected into agentic tooling context to steer code, tests, docs, and examples toward existing BayBE conventions.

Changes:

Introduce root-level AI-agent coding guide (AGENTS.md) covering architecture, typing, imports, CI, and workflow.
Add test-suite conventions for pytest structure/fixtures/parametrization (tests/AGENTS.md).
Add docs and examples conventions for Sphinx/MyST and runnable scripts (docs/AGENTS.md, examples/AGENTS.md).

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 10 comments.

File	Description
`AGENTS.md`	Project-wide agent guidance for BayBE coding patterns, tooling, and PR workflow
`tests/AGENTS.md`	Conventions for writing/organizing pytest tests and fixtures
`docs/AGENTS.md`	Conventions for Sphinx/MyST docs authoring and syntax
`examples/AGENTS.md`	Conventions for executable examples and CI smoke-test behavior



m-aebrer · 2026-03-30T13:47:55Z

tests/AGENTS.md

I would add this rather in contributing.md instead of a tests specific agents.md, though I don't think that's a major issue.

I also usually add these test principles to get better results (they assume you have pre-commit hooks to run the tests, which you should):

Testing Conventions

Summary:

Test through public interfaces: call functions, assert on return values and file system side effects

Mock only external boundaries: the [INSERT BOUNDARY POINT HERE] is the boundary — provide a [SPECIFIC MOCKER HERE, WITH DEFINED NARROW SCOPE] that emits [PROJECT SPECIFIC DETAILS]

One test, one behavior: don't combine assertions about different concerns

Tests as specs: names like test_qc_fail_triggers_retry, test_missing_smiles_fails_validation

Run tests: python -m pytest backend/tests/

Pre-commit hook runs tests automatically before each commit. On a fresh clone, activate it once:

git config core.hooksPath .githooks

The total test suite runs 20-30 minutes, so we can't run it at a high frequency such as in pre-commit. I would probably rather instruct agents to run the tests relevant to the respective feature.

otherwise all looks good to me. I would say there may be some issues with lack of compliance for the anti-patterns as some of them are a bit underspecified, eg: No conftest pollution — prefer local fixtures. -> "pollution" would go better with examples of before and after for example

ah @Scienfitz good point on too many tests. In that case I have CI run the full suite, and locally I have a subset using tags to only run those ones on pre-commit hook. This usually also prompts the agent naturally to run any task specific tests as well, as a reminder.

You can also just add a hook that only injects a warning/reminder about running tests into the chat context.

m-aebrer · 2026-03-30T13:51:54Z

AGENTS.md

seems like you have a no fallback rule.

I have run into issues with this before, here's my section about it:

## Zero Fallback Principle

Execution MUST abort immediately on any missing dependency, malformed data field, absent required column/key, unexpected enum value, or structural schema mismatch. Do **not** continue in a degraded or "best effort" mode. No silent defaults. No guessing. Expensive downstream computation must be prevented when prerequisites are not perfectly satisfied.

## Validation Philosophy

- **Per-Template Strict Validation**: Each template defines exact allowed factor choices; models cannot select factors outside their template.
- **Validation at Inference Time**: PromptSwarm validates outputs using the universal `ConditionRecommendationModel` with the template's `_allowed_choices` passed as validation context.
- **Zero Fallback**: Any validation failure TERMINATES that template's inference (circuit breaker pattern in promptswarm).
- **No Partial Results**: Invalid outputs are not written to the WFM database; no partial pipelines or continued processing.
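To make the zero-fallback idea concrete, here is a minimal sketch. The record fields, the allowed solvent set, and the function name are hypothetical and are not taken from either repository; the point is only that the check raises on the first unmet prerequisite instead of substituting a default.

```python
from typing import Any, Mapping

# Hypothetical schema, used only for illustration.
REQUIRED_KEYS = ("smiles", "solvent")
ALLOWED_SOLVENTS = frozenset({"water", "ethanol"})


def validate_record(record: Mapping[str, Any]) -> None:
    """Raise immediately if any prerequisite is not satisfied."""
    missing = [key for key in REQUIRED_KEYS if key not in record]
    if missing:
        # Fail fast: no silent default such as record.get("solvent", "water").
        raise KeyError(f"Missing required keys: {missing}")
    if record["solvent"] not in ALLOWED_SOLVENTS:
        # Unexpected enum-like value: abort rather than guess a replacement.
        raise ValueError(f"Unexpected solvent: {record['solvent']!r}")


def process_all(records: list[Mapping[str, Any]]) -> int:
    """Validate every record before any (expensive) work would start."""
    for record in records:
        validate_record(record)
    # Placeholder for the expensive downstream computation.
    return len(records)
```

Validating the whole batch before doing any work is one way to honor "no partial results": either everything passes or nothing is processed.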

Then I also have a hook that autodetects when Claude is trying to add a fallback and stops him:

https://github.com/merckgroup/condition_rec_benchmarking/blob/main/.claude/hooks/check-fallback.sh

I've extracted some generalizable principles out of this and added to the file under the respective sections (590b521)

We'll discuss whether we can do something with the actual hook as well; in any case, thanks for your input 🙌


LeanAndMean · 2026-03-30T21:21:25Z

Hey, nice work on the recent updates -- the fail-fast language in sections 2/8/16 reads well, and the admonition rewrite is much clearer.

I've been reviewing this alongside how I structure similar files in my repos and wanted to share a few thoughts. Happy to help implement any of these if useful, but also easy for you to pick up directly since they're mostly structural.

CLAUDE.md + symlinks: Have you considered making the canonical file CLAUDE.md? Claude Code automatically loads it into every conversation, and it supports subdirectory scoping -- a tests/CLAUDE.md is auto-injected when the agent works on files in tests/. That's a perfect match for the distributed structure you've set up here.

To keep other tools working, symlink from the root:

AGENTS.md -> CLAUDE.md
.github/copilot-instructions.md -> CLAUDE.md

Single source of truth, three consumers. The subdirectory files (tests/CLAUDE.md, etc.) don't need symlinks since the other tools don't have equivalent scoping.
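The symlink layout above could be created with a small script like the following sketch. The helper name `link_agent_files` is hypothetical, and the demo runs in a temporary directory rather than a real checkout; it assumes a POSIX-like system where creating symlinks needs no special privileges.

```python
import os
import tempfile
from pathlib import Path


def link_agent_files(repo_root: Path) -> list[Path]:
    """Point AGENTS.md and the Copilot instructions file at CLAUDE.md."""
    canonical = repo_root / "CLAUDE.md"
    if not canonical.is_file():
        raise FileNotFoundError(f"{canonical} must exist before linking")
    links = [
        repo_root / "AGENTS.md",
        repo_root / ".github" / "copilot-instructions.md",
    ]
    for link in links:
        link.parent.mkdir(parents=True, exist_ok=True)
        # Use a relative target so the link stays valid in any clone location.
        target = os.path.relpath(canonical, start=link.parent)
        link.symlink_to(target)
    return links


# Demo in a throwaway directory standing in for the repository root.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "CLAUDE.md").write_text("# Project instructions\n")
    for created in link_agent_files(root):
        print(created.relative_to(root), "->", os.readlink(created))
```

Relative link targets are a deliberate choice here: an absolute target would break as soon as the repository is cloned to a different path.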

Complementary content from /init: I ran Claude Code's /init command on this repo and it generated a CLAUDE.md that has some nice qualities that could complement what you've already built here:

A full commands section covering install, test (pytest --fast, pytest -k "test_name"), lint, typecheck, and all tox environments -- including the python version append convention (tox -e mypy-py310) and tox -p for parallel runs. Your AGENTS.md covers conventions thoroughly but doesn't tell the agent how to validate its work; this fills that gap.
The architecture section maps each domain concept to its file path (Campaign -> baybe/campaign.py, SearchSpace -> baybe/searchspace/, etc.), which gives an agent both "what is this" and "where to find it" in one pass.
It distills design principles to the three an agent is most likely to violate: comp-rep boundary, lazy imports ("Non-negotiable"), and the serialization pattern. Useful as a complement to the fuller treatment in your AGENTS.md.
It's 161 lines vs the current ~270, which matters for token budget since the root file is injected into every conversation.

I'm not suggesting replacing your work -- the AGENTS.md has valuable depth (validation patterns, deprecation strategy, the naming table, fixture architecture) that the /init output doesn't cover. But the commands and architecture-to-filepath mapping would be easy to fold in on your end. I can share the generated file if you'd like to compare.

Anti-pattern specificity: As m-aebrer mentioned, rules like "prefer local fixtures" are easier for agents to comply with when there's a concrete example. Even a one-liner showing the preferred pattern would help.

Let me know if any of this is useful -- happy to put together changes on a separate branch or contribute directly here, whichever you'd prefer.

Scienfitz

@LeanAndMean many thanks for the input. I'm turning your comment into several threads because otherwise it will be nearly impossible to converse about the several sub-suggestions made therein. Feel free to comment in the respective threads.

Scienfitz · 2026-03-30T23:05:29Z

AGENTS.md

via Kevin: Rename to CLAUDE.md + symlink for other tools

Consider making CLAUDE.md the canonical file (auto-loaded by Claude Code, supports subdirectory scoping). Symlink AGENTS.md and .github/copilot-instructions.md to it so all three tools share a single source of truth.

@LeanAndMean not sure if I got your suggestion, because that was already fully the idea of the AGENTS.md files. Afaik tools like claude or opencode already recognize AGENTS.md (including auto-context-ingestion and subdirectory scoping) and do not really need CLAUDE.md or any other more specifically named file.

I also would not tailor the file too strictly to Claude. Even though the majority of people might use claude by now, I would prefer to keep it general as other people might use other tools. Not sure if people might think "Oh no, I have to use Claude when working with Agentic AI and BayBE" which I'd like to avoid.

AFAIK Claude Code CLI will only operate "properly" if you use the CLAUDE.md convention. Here's what Claude Code CLI had to say about it, after reviewing its own usage guide:

❯ Research whether Claude Code CLI recognizes AGENTS.md in the absence of CLAUDE.md

● claude-code-guide(Research AGENTS.md recognition)
  ⎿ Done (5 tool uses · 40.6k tokens · 23s) (ctrl+o to expand)

● Claude Code does not recognize AGENTS.md — it only loads CLAUDE.md files.

Details

- AGENTS.md is a convention used by other coding agents (e.g., Cursor, Copilot), not Claude Code.
- Claude Code looks for CLAUDE.md in these locations:
  - ./CLAUDE.md or ./.claude/CLAUDE.md (project root)
  - ~/.claude/CLAUDE.md (user home)
  - Subdirectories (loaded on demand when entering them)
- If CLAUDE.md doesn't exist, Claude Code simply has no project instructions — it does not fall back to AGENTS.md.

If you need both

You can bridge them using the @ import syntax in CLAUDE.md:

@AGENTS.md
## Claude-specific instructions
...

This way both tools share the same base instructions without duplication.

Sources:
- https://code.claude.com/docs/en/memory.md

Using @AGENTS.md in CLAUDE.md is another option - it inlines the file contents directly. I go with symlinks because they don't rely on any code harness supporting inlining files (a more futureproof strategy).

The unfortunate truth is that Claude was first, created CLAUDE.md, then the industry followed, creating AGENTS.md (similar to Nvidia with CUDA, and everyone else with OpenCL). The only way to not impose using one coding harness over another is to have both in your library.

The day may come when Claude Code CLI supports AGENTS.md, but it is not this day.

hmmk, wasn't aware of that (using opencode myself). Surprising to say the least (maybe now that the source code got hacked they'll include it somehow :)

In that case CLAUDE.md with a symlink is the obvious go-to.

Scienfitz · 2026-03-30T23:05:47Z

AGENTS.md

via Kevin: Add a commands/validation section

The current file covers conventions but doesn't tell the agent how to validate its work. Add a commands section covering install, test (pytest --fast, pytest -k "test_name"), lint, typecheck, and tox environments (including tox -e mypy-py310 and tox -p for parallel).

I like the idea. Maybe we could put that in, or expand, CONTRIBUTING and link it in AGENTS.

I feel like having some important example commands in AGENTS.md is important for also showing AI agents how to use your library the way you do, but your mileage may vary.

The best way to figure this out is with testing. Try developing with and without this change - see if you can tell the difference. If you can't, then it doesn't need to be in there.
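As a rough sketch of what a validation entry point could look like, the commands quoted in this thread can be collected in one place and turned into argument lists. The task labels and helper names are hypothetical, only the command strings come from the discussion above, and whether those flags and tox environments exist depends on the repository's own configuration.

```python
import shlex
import subprocess

# Command strings as quoted in this thread; the task labels are hypothetical.
VALIDATION_COMMANDS = {
    "fast-tests": "pytest --fast",
    "single-test": 'pytest -k "test_name"',
    "typecheck": "tox -e mypy-py310",
    "all-envs-parallel": "tox -p",
}


def build_command(task: str) -> list[str]:
    """Return the argument list for a known validation task."""
    if task not in VALIDATION_COMMANDS:
        # Fail loudly on unknown tasks instead of guessing a command.
        raise KeyError(f"Unknown validation task: {task!r}")
    return shlex.split(VALIDATION_COMMANDS[task])


def run(task: str) -> int:
    """Run the task in the current directory and return its exit code."""
    return subprocess.run(build_command(task), check=False).returncode


if __name__ == "__main__":
    # Only print the commands here; call run(...) to actually execute one.
    for name in VALIDATION_COMMANDS:
        print(name, "->", build_command(name))
```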

Scienfitz · 2026-03-30T23:06:04Z

AGENTS.md

via Kevin: Add an architecture-to-filepath mapping

Map each core domain concept to its file path (e.g. Campaign → baybe/campaign.py, SearchSpace → baybe/searchspace/). Gives agents "what is this" and "where to find it" in one pass.

This can be helpful for reducing the search time for agents as well as reducing the number of times they decide to not look for something when they should.
My policy is just to use the default /init with Claude Code CLI, then tell Claude to add/update the filepath mapping if I see it struggling to find things (or after major changes to the repo).
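One way to keep such a mapping from going stale is to store it as data and check it against the working tree. The sketch below contains only the two entries named in this thread; the helper names are hypothetical, and a real table would list more concepts.

```python
from pathlib import Path

# Only the two entries mentioned in this thread.
CONCEPT_PATHS = {
    "Campaign": "baybe/campaign.py",
    "SearchSpace": "baybe/searchspace/",
}


def find_stale_entries(repo_root: Path, mapping: dict[str, str]) -> list[str]:
    """Return the concepts whose mapped path does not exist under repo_root."""
    return [
        concept
        for concept, relative in mapping.items()
        if not (repo_root / relative).exists()
    ]


def render_markdown(mapping: dict[str, str]) -> str:
    """Render the mapping as a Markdown table for an instructions file."""
    lines = ["| Concept | Path |", "| --- | --- |"]
    lines += [f"| {concept} | `{path}` |" for concept, path in mapping.items()]
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_markdown(CONCEPT_PATHS))
    print("Stale entries:", find_stale_entries(Path.cwd(), CONCEPT_PATHS))
```

Running the staleness check after major refactors would flag entries to update, which matches the "update the mapping after major changes" policy described above.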

Scienfitz · 2026-03-30T23:06:25Z

AGENTS.md

via Kevin: Distill key design principles for agents

Highlight the 2–3 principles agents are most likely to violate: comp-rep boundary, lazy imports ("non-negotiable"), and the serialization pattern. Complements the fuller treatment already in the file.

This tends to reduce the number of human interventions and corrections you need to perform while developing. If the agents know your value system, they will tend to produce things closer to what you're looking for, and it requires less human effort to get a high-quality result. Often, >99% of developers' values are aligned, so you don't need to state them (e.g., "Don't write buggy code"). It tends to be when there are multiple viable options (e.g., do we want fallbacks or not?) that it becomes necessary to define your terms. The "best" AGENTS.md/CLAUDE.md will contain only the things necessary to get the desired behavior. It's a waste of tokens to tell AI agents to do something they were going to do anyway.

Again, only through testing will you figure out what the right balance is, because it's not clear what values they hold implicitly in different situations.


Scienfitz · 2026-03-30T23:07:10Z

AGENTS.md

via Kevin and Drew: Add concrete examples to anti-pattern rules

Rules like "prefer local fixtures" are easier for agents to follow with a one-liner showing the preferred pattern. Even minimal examples help.
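For the "prefer local fixtures" rule, a minimal example of the preferred pattern might look like the sketch below. The fixture and test names are hypothetical and not taken from the BayBE test suite. The fixture lives in the test module next to the only tests that use it; the pattern to avoid would be placing the same single-use fixture in `conftest.py`, where every test in the directory can see it.

```python
import pytest


@pytest.fixture
def small_values() -> list[float]:
    """Local fixture: defined in the same module as the tests that need it."""
    return [1.0, 2.0, 3.0]


def test_mean_of_small_values(small_values: list[float]) -> None:
    """One test, one behavior: only the mean is checked here."""
    assert sum(small_values) / len(small_values) == 2.0
```

A fixture would move to `conftest.py` only once several test modules genuinely share it.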

AVHopp · 2026-03-31T07:11:06Z

AGENTS.md

The links in this file currently break doc building. We could either exclude it fully from the documentation or only from linkcheck, as I think trying to adjust the links such that they work in the compiled doc might then not make sense for the agents anymore (but we could also try this). Opinions?

should be excluded

m-aebrer · 2026-03-31T14:52:18Z

also +1 to @LeanAndMean's point about CLAUDE.md vs AGENTS.md. I have my harness dreb set to load both, but as I switch to AGENTS.md for new projects, I have been making nearly empty CLAUDE.md files that just inline AGENTS.md, and it works pretty much exactly as you'd hope. Very effective solution.

tobiasploetz · 2026-04-01T07:43:06Z

AGENTS.md

+  properties, 4) methods. Within each group use alphabetical order.
+
+### Attribute Docstrings
+String literals immediately below field declarations, blank lines between attributes.


add example?
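A possible example for the attribute docstring rule is sketched below. It uses a stdlib dataclass as a dependency-free stand-in (the rule itself is written for the project's own classes), and the class and field names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ExampleParameter:
    """A hypothetical class illustrating attribute docstrings."""

    name: str
    """The name of the parameter."""

    values: tuple[float, ...] = field(default=())
    """The values the parameter can take."""

    tolerance: float = 0.0
    """The absolute tolerance used when matching values."""
```

Each string literal sits directly below the field it documents, and a blank line separates one attribute from the next.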

tobiasploetz · 2026-04-01T07:44:21Z

AGENTS.md

+- Sphinx roles for cross-refs: `:func:`, `:class:`, `:meth:`. Double backticks for
+  literals.
+- Attrs validators get `# noqa: DOC101, DOC103` (pydoclint confused by
+  `(self, attribute, value)` signature).


add example?
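A possible example for these two rules is sketched below. The validator is a plain function with the `(self, attribute, value)` signature so the block stays dependency-free; wiring it to an actual attrs field is omitted. The function name and the referenced class are hypothetical, and placing the `noqa` comment on the `def` line is an assumption about where the linter reports the finding.

```python
from typing import Any


def validate_positive(self: Any, attribute: Any, value: float) -> None:  # noqa: DOC101, DOC103
    """Ensure that ``value`` is strictly positive.

    Written in the style of a validator for a class such as
    :class:`ExampleParameter` (hypothetical). See :func:`float` for the
    accepted numeric inputs.

    Raises:
        ValueError: If ``value`` is zero or negative.
    """
    if value <= 0:
        # Use the attribute's name in the message when one is available.
        name = getattr(attribute, "name", "value")
        raise ValueError(f"'{name}' must be positive, got {value}.")
```

The docstring uses Sphinx roles for cross-references and double backticks for the literal `value`, while the arguments section is intentionally left out, which is what the `noqa` codes suppress.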

tobiasploetz · 2026-04-01T07:46:42Z

AGENTS.md

testing guidelines missing?

I would also add a small collection of the main abstractions in BayBE (e.g. Campaign, Parameter, ...) so that the AI has the big picture of your package semantics

Scienfitz added 4 commits March 29, 2026 18:10

Add instructions for tests

8ab5b0e

Add instructions for examples

71ceb31

Add instructions for docs

107d1b5

Add instructions for code

31cddd7

Scienfitz self-assigned this Mar 29, 2026

Scienfitz requested review from AVHopp and AdrianSosic as code owners March 29, 2026 22:18

Scienfitz added the repo Requires changes to the project configuration label Mar 29, 2026

Copilot AI review requested due to automatic review settings March 29, 2026 22:18

Copilot started reviewing on behalf of Scienfitz March 29, 2026 22:19 View session

Copilot AI reviewed Mar 29, 2026


AVHopp reviewed Mar 30, 2026


AdrianSosic reviewed Mar 30, 2026


m-aebrer reviewed Mar 30, 2026


Scienfitz and others added 7 commits March 30, 2026 19:59

Remove double negative

324a47d

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Fix typo

9a13cc9

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Improve adminition instructions

cb78918

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Fix typo

a891a2a

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Fix formatting

27819d0

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Add garbage collection in exceptions.py

f6e7f20

To comply with agent instruction.

Add improved early fail instructions

590b521

Scienfitz commented Mar 30, 2026


AVHopp reviewed Mar 31, 2026


tobiasploetz reviewed Apr 1, 2026

