Affected file
systems/bekker.yaml
Citation system
Bekker numbering (Aristotelian corpus)
What is wrong or missing?
The current locator_regex requires the page component to be 3–4 digits:
locator_regex: '^(?<page>[0-9]{3,4})(?<column>[ab])(?<line>[0-9]{1,2})$'
Bekker pagination, however, begins at page 1. Valid early Aristotelian references such as 1a1, 16a1, 24b10, and 99a5 therefore fail validation. The issue is currently latent because the only registered Aristotle work (Nicomachean Ethics) sits at four-digit pages (1094a–1181b), but it would block registering anything from the opening of the corpus (e.g. Categories, De Interpretatione).
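The rejection can be reproduced with a quick check. This is a minimal sketch, assuming Python's `re` module as the test harness: Python spells named groups `(?P<name>...)`, so the pattern is transcribed accordingly, and the registry's own regex engine may differ.

```python
import re

# Current pattern from systems/bekker.yaml, with (?<name>...) rewritten as
# (?P<name>...) because Python's re module requires that spelling.
CURRENT = re.compile(
    r"^(?P<page>[0-9]{3,4})(?P<column>[ab])(?P<line>[0-9]{1,2})$"
)

# Early-corpus references quoted in this issue.
early_refs = ["1a1", "16a1", "24b10", "99a5"]

# Collect every early reference the current pattern refuses.
rejected = [ref for ref in early_refs if CURRENT.fullmatch(ref) is None]
print(rejected)

# A four-digit Nicomachean Ethics page still validates, which is why the
# problem has stayed latent so far.
print(CURRENT.fullmatch("1094a1") is not None)
```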
The current regex also accepts non-canonical forms:
- leading-zero pages such as 0983b10
- line zero such as 983b0
- leading-zero lines such as 983b01
Because the deterministic UUID seed uses the exact normalized locator, each of these spellings would mint a distinct permanent identity for the same passage.
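To illustrate the identity problem, here is a rough sketch. It assumes a name-based UUID (version 5) seeded with the bare locator string; the namespace, the `mint` helper, and the seed layout are all hypothetical, since the registry's real seed format is not shown in this issue. The pattern is again transcribed to Python's `(?P<name>...)` syntax.

```python
import re
import uuid

# Current pattern, transcribed to Python's named-group syntax.
CURRENT = re.compile(
    r"^(?P<page>[0-9]{3,4})(?P<column>[ab])(?P<line>[0-9]{1,2})$"
)

# Hypothetical namespace: a stand-in for whatever the registry really uses.
NAMESPACE = uuid.uuid5(uuid.NAMESPACE_URL, "https://example.org/systems/bekker")


def mint(locator: str) -> uuid.UUID:
    """Derive a deterministic UUID from the exact locator string (assumed scheme)."""
    return uuid.uuid5(NAMESPACE, locator)


# Two spellings of the same passage, page 983, column b, line 10.
spellings = ["983b10", "0983b10"]

# Both spellings pass validation under the current pattern...
both_accepted = all(CURRENT.fullmatch(s) is not None for s in spellings)

# ...yet they seed different identifiers.
distinct_ids = {mint(s) for s in spellings}

print(both_accepted, len(distinct_ids))
```

Under these assumptions the same passage ends up with two permanent identifiers, which is the outcome the stricter pattern is meant to rule out.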
Suggested correction or new value
Change the regex to accept pages 1–9999 and lines 1–99, both without leading zeros:
locator_regex: '^(?<page>[1-9][0-9]{0,3})(?<column>[ab])(?<line>[1-9][0-9]?)$'
examples:
  valid:
    - '1a1'
    - '16a1'
    - '24b10'
    - '1094a1'
    - '1462b10'
  invalid:
    - '983'
    - '983c10'
    - '983b'
    - '983b100'
    - '0983b10'
    - '983b0'
    - '983b01'
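The proposed pattern can be checked against the example lists above. This is a small sketch, assuming Python's `re` module as the harness, with the named groups transcribed to Python's `(?P<name>...)` spelling; the `failures` list is just an illustrative way to report mismatches.

```python
import re

# Proposed pattern, transcribed to Python's named-group syntax.
PROPOSED = re.compile(
    r"^(?P<page>[1-9][0-9]{0,3})(?P<column>[ab])(?P<line>[1-9][0-9]?)$"
)

valid = ["1a1", "16a1", "24b10", "1094a1", "1462b10"]
invalid = ["983", "983c10", "983b", "983b100", "0983b10", "983b0", "983b01"]

# Record every example whose validation result contradicts its list.
failures = []
for ref in valid:
    if PROPOSED.fullmatch(ref) is None:
        failures.append(("should match", ref))
for ref in invalid:
    if PROPOSED.fullmatch(ref) is not None:
        failures.append(("should not match", ref))

# An empty list means the pattern agrees with both example lists.
print(failures)
```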
Happy to open a small PR if that is useful.
Evidence and sources
- Bekker, August Immanuel (ed.), Aristotelis Opera, Berlin 1831 — pagination runs from p. 1.
- Examples of early-corpus references: Aristotle, Categories 1a1 (https://www.perseus.tufts.edu/hopper/text?doc=Perseus%3Atext%3A1999.01.0051).
- Current registry profile: systems/bekker.yaml.
Rights or licence note
Open. Bekker 1831 is public-domain; Perseus links are CC-BY-SA.
Related
This is one of two concrete cases (the other is bible-book-chapter-verse, filed separately) that motivate a broader spec-level question: whether locator_regex + examples is the right validation contract for citation-system profiles. That meta-discussion belongs in the standard repo (textrefs/textrefs.org#9); this issue is intentionally narrow.