Affected file
systems/bible-book-chapter-verse.yaml
Citation system
Bible book-chapter-verse (currently labelled "OSIS-style")
What is wrong or missing?
The current locator_regex requires the book component to start with a letter:
locator_regex: '^(?<book>[A-Za-z][A-Za-z0-9_]*)\.(?<chapter>[1-9][0-9]*)\.(?<verse>[1-9][0-9]*)$'
This rejects every digit-initial numbered book, in both the OSIS abbreviations (1Cor, 2Sam, 3John) and the full-name forms (1Corinthians, 2Samuel, 3John). As soon as the registry moves beyond the four Gospels, the Pentateuch, and Psalms to any numbered book, validation will reject it.
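A quick way to reproduce the rejection, as a minimal sketch in Python. Python's re module spells named groups as (?P<name>...), so the pattern from the YAML is converted mechanically first; the helper name to_python and the sample locators are assumptions for illustration, not part of the registry.

```python
import re

# Current pattern, copied from systems/bible-book-chapter-verse.yaml.
CURRENT = r'^(?<book>[A-Za-z][A-Za-z0-9_]*)\.(?<chapter>[1-9][0-9]*)\.(?<verse>[1-9][0-9]*)$'


def to_python(pattern: str) -> re.Pattern:
    # Hypothetical helper: Python writes named groups as (?P<name>...).
    # Only safe because this pattern contains no lookbehind assertions.
    return re.compile(pattern.replace("(?<", "(?P<"))


current = to_python(CURRENT)

# Letter-initial books pass; every digit-initial book is rejected.
for locator in ["Genesis.1.1", "1Cor.13.4", "2Samuel.7.12", "3John.1.4"]:
    print(locator, bool(current.match(locator)))
```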
Two related problems sit beside the regex:
- Vocabulary is unpinned. The profile is labelled "OSIS-style" but the examples (Genesis.1.1, Matthew.5.3) use full names, not OSIS codes (which would be Gen.1.1, Matt.5.3). It is currently unclear whether the canonical vocabulary is OSIS codes or full names, and the regex alone cannot enforce either choice.
- Case folding is open. Today John.3.16, john.3.16, and JOHN.3.16 all match, and would mint three distinct permanent identities. The canonical case needs to be declared.
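The case problem can be shown the same way. This is a sketch only: the pattern is the current one rewritten in Python's (?P<name>...) spelling, and the variable names are mine.

```python
import re

# Current pattern, with named groups in Python's (?P<name>...) spelling.
current = re.compile(
    r'^(?P<book>[A-Za-z][A-Za-z0-9_]*)\.(?P<chapter>[1-9][0-9]*)\.(?P<verse>[1-9][0-9]*)$'
)

variants = ["John.3.16", "john.3.16", "JOHN.3.16"]
accepted = [v for v in variants if current.match(v)]

# All three pass validation, yet they are three different strings,
# so each would be minted as its own identifier.
print(len(accepted), len(set(accepted)))

# Compared case-insensitively, they collapse to a single locator.
print(len({v.casefold() for v in accepted}))
```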
Suggested correction or new value
Two parts:
A. Regex (mechanical, low-risk): widen the book component to allow one optional leading digit (1 to 4) while still requiring at least one letter, so 1.1.1 stays invalid:
locator_regex: '^(?<book>[1-4]?[A-Za-z][A-Za-z0-9_]*)\.(?<chapter>[1-9][0-9]*)\.(?<verse>[1-9][0-9]*)$'
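A small sanity check of the widened pattern, again as a Python sketch with named groups in the (?P<name>...) spelling. The sample locators are my own choices and are not taken from the registry's examples or invalid lists.

```python
import re

# Proposed pattern, with named groups in Python's (?P<name>...) spelling.
proposed = re.compile(
    r'^(?P<book>[1-4]?[A-Za-z][A-Za-z0-9_]*)\.(?P<chapter>[1-9][0-9]*)\.(?P<verse>[1-9][0-9]*)$'
)

# Letter-initial and single-digit-prefixed books should now pass.
should_match = ["Genesis.1.1", "1Cor.13.4", "2Samuel.7.12", "3John.1.4"]

# Still invalid: no letter in the book, two leading digits, chapter zero.
should_reject = ["1.1.1", "12Cor.1.1", "1Cor.0.4"]

for locator in should_match + should_reject:
    print(locator, bool(proposed.match(locator)))
```

The named groups still split a numbered book cleanly: for 1Cor.13.4 the book group is 1Cor, chapter is 13, and verse is 4.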
B. Vocabulary + case (decision needed): before adjusting examples and prose, the project needs to pick one of:
- OSIS codes (Gen, Matt, 1Cor): the literature standard, and what the current label advertises; or
- Full English names (Genesis, Matthew, 1Corinthians): what the current examples and (per agent context) resolver comments suggest.
The project also needs to pick a canonical case. After that decision, the examples and the preferred_label should be updated together, and case-variant spellings added to invalid.
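To illustrate why part B needs more than a regex, here is a rough sketch of a validator that pairs the pattern with an explicit book list. Everything named here is hypothetical: CANONICAL_BOOKS, is_valid_locator, and the tiny four-entry vocabulary are placeholders, and the use of OSIS codes is for illustration only and does not prejudge the decision.

```python
import re

# Proposed pattern, with named groups in Python's (?P<name>...) spelling.
proposed = re.compile(
    r'^(?P<book>[1-4]?[A-Za-z][A-Za-z0-9_]*)\.(?P<chapter>[1-9][0-9]*)\.(?P<verse>[1-9][0-9]*)$'
)

# Hypothetical, deliberately tiny vocabulary. A real list would cover
# every book, in whichever naming scheme and case the project picks.
CANONICAL_BOOKS = frozenset({"Gen", "Matt", "John", "1Cor"})


def is_valid_locator(locator: str) -> bool:
    # Shape check first, then an exact, case-sensitive vocabulary check.
    m = proposed.match(locator)
    return m is not None and m.group("book") in CANONICAL_BOOKS


for locator in ["John.3.16", "john.3.16", "Genesis.1.1", "1Cor.13.4"]:
    print(locator, is_valid_locator(locator))
```

Because set membership is exact, a single declared spelling per book settles both the vocabulary and the case question at once.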
Happy to open a small PR for part A once a maintainer signs off on the direction; I would prefer not to push part B unilaterally.
Evidence and sources
systems/bible-book-chapter-verse.yaml
Rights or licence note
The OSIS specification is open (CrossWire). The SBL Handbook is copyrighted, but the abbreviation list itself is factual.
Related
This issue and the Bekker issue (filed separately) together motivate a broader spec-level question: whether locator_regex + examples is the right validation contract in the first place. Bible is the strongest motivating case, because book vocabulary genuinely cannot be captured by regex alone. That discussion belongs in the standard repo and is tracked at textrefs/textrefs.org#9.