Add ECMA-376 schema validation as a CI gate for emitted document.xml #214

@stevenobiajulu

Context

The recent peer review of #208 surfaced two distinct ECMA-376 conformance gaps in validateFieldStructure (#209) that had gone unnoticed because no automated schema validation runs against engine output. The Lean proof work has shown that hand-rolled conformance checks (the validateFieldStructure family) tend to under-specify the actual schema. A general-purpose schema validator catches whole classes of these issues at once.

Real-world precedents:

  • docx4j uses a JAXB JaxbValidationEventHandler against the schema during parsing.
  • LibreOffice runs its export through XML attribute-output checks before emission.
  • Microsoft Word uses the schema-derived state machine and discards malformed structures.

What this would add

  • A CI job that runs every emitted document.xml (from the integration test corpus) through an ECMA-376 / Open Packaging Conventions validator.
  • Candidate validators to evaluate: officevalidator (Microsoft's tool), xmlstarlet val against the published XSDs, openpackaging-validator (Java).
  • The validator runs against the outputs of: lean-spec-bridge.test.ts, collapsed-field-inplace.test.ts, inPlaceModifier.test.ts, and round-trip-inplace.test.ts.
  • Failures fail the CI job and block merge.
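
A minimal sketch of how the gate step could be shaped, assuming the xmlstarlet candidate is chosen. Everything named here (`Runner`, `GateResult`, `xmlstarletArgs`, `runSchemaGate`) is hypothetical and not taken from the repository; the only external facts relied on are xmlstarlet's `val` subcommand with its `--err` and `--xsd` options and its non-zero exit status on invalid input. The process runner is injected so the logic can be exercised without the tool installed.

```typescript
// Hypothetical sketch: none of these names exist in the repository.
// A Runner executes a command and reports its exit status and stderr.
type Runner = (command: string, args: string[]) => { status: number; stderr: string };

interface GateResult {
  failures: { file: string; stderr: string }[];
  passed: boolean;
}

// Argument list for `xmlstarlet val`:
// --err prints verbose errors on stderr, --xsd selects the schema file.
function xmlstarletArgs(xsdPath: string, xmlPath: string): string[] {
  return ["val", "--err", "--xsd", xsdPath, xmlPath];
}

// Validate every emitted file and collect the ones the validator rejects.
function runSchemaGate(files: string[], xsdPath: string, run: Runner): GateResult {
  const failures: GateResult["failures"] = [];
  for (const file of files) {
    const { status, stderr } = run("xmlstarlet", xmlstarletArgs(xsdPath, file));
    if (status !== 0) {
      failures.push({ file, stderr });
    }
  }
  // An empty corpus counts as a failure so the gate cannot pass vacuously.
  return { failures, passed: files.length > 0 && failures.length === 0 };
}
```

In CI, `run` would presumably wrap something like Node's `child_process.spawnSync`, and the job would exit non-zero whenever `passed` is false. Swapping in a different validator should only require replacing the command name and the argument builder.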

Why this is worth doing

Notes

  • Validating against the full ECMA-376 schema is non-trivial: the schema is huge, and validators differ in strictness. Investigate which validator gives the right strictness/false-positive balance before making the job a required CI check.
  • This is a defense-in-depth measure, not a substitute for validateFieldStructure and the per-wrapper neutrality checks. Field-context placement is a semantic constraint the schema can't express on its own.

Ref: #208, #209.