Commit cfbf786

Merge pull request #19 from VH-Lab/claude/add-symmetry-tests-vtgOJ
Add cross-language symmetry tests for MATLAB ↔ Python DID databases
2 parents 1309b99 + ab862dc commit cfbf786

17 files changed

Lines changed: 637 additions & 11 deletions

.github/workflows/symmetry.yml

Lines changed: 95 additions & 0 deletions
@@ -0,0 +1,95 @@
+name: Cross-Language Symmetry Tests
+
+on:
+  push:
+    branches: [main]
+  pull_request:
+    branches: [main]
+  workflow_dispatch:
+
+jobs:
+  symmetry:
+    name: MATLAB <-> Python symmetry tests
+    runs-on: ubuntu-latest
+    steps:
+      - name: Check out DID-python
+        uses: actions/checkout@v4
+
+      - name: Check out DID-matlab
+        uses: actions/checkout@v4
+        with:
+          repository: VH-Lab/DID-matlab
+          path: DID-matlab
+
+      # --- MATLAB setup ---
+      - name: Set up MATLAB
+        uses: matlab-actions/setup-matlab@v2
+        with:
+          release: latest
+          cache: true
+          products: Statistics_and_Machine_Learning_Toolbox
+
+      - name: Install MatBox
+        uses: ehennestad/matbox-actions/install-matbox@v1
+
+      - name: Install MATLAB dependencies (mksqlite etc.)
+        uses: matlab-actions/run-command@v2
+        with:
+          command: |
+            addpath(genpath("DID-matlab/src"));
+            addpath(genpath("DID-matlab/tools"));
+            matbox.installRequirements(fullfile(pwd, "DID-matlab"));
+
+      # --- Python setup ---
+      - name: Set up Python 3.12
+        uses: actions/setup-python@v5
+        with:
+          python-version: "3.12"
+
+      - name: Install Python dependencies
+        run: |
+          python -m pip install --upgrade pip
+          pip install -e ".[dev]"
+
+      # --- Step 1: MATLAB makeArtifacts ---
+      - name: "Step 1: MATLAB makeArtifact tests"
+        uses: matlab-actions/run-command@v2
+        with:
+          command: |
+            addpath(genpath("DID-matlab/src"));
+            addpath(genpath("DID-matlab/tests"));
+            addpath(genpath("DID-matlab/tools"));
+            import matlab.unittest.TestRunner;
+            import matlab.unittest.TestSuite;
+            runner = TestRunner.withTextOutput;
+            makeSuite = TestSuite.fromPackage("did.symmetry.makeArtifacts", "IncludingSubpackages", true);
+            makeResults = runner.run(makeSuite);
+            disp(table(makeResults));
+            assert(all([makeResults.Passed]), "makeArtifacts tests failed");
+
+      # --- Step 2: Python makeArtifacts + readArtifacts ---
+      - name: "Step 2: Python makeArtifact tests"
+        run: pytest -m make_artifacts -v
+
+      - name: "Step 2: Python readArtifact tests"
+        run: pytest -m read_artifacts -v
+
+      # --- Step 3: MATLAB readArtifacts ---
+      - name: "Step 3: MATLAB readArtifact tests"
+        uses: matlab-actions/run-command@v2
+        with:
+          command: |
+            addpath(genpath("DID-matlab/src"));
+            addpath(genpath("DID-matlab/tests"));
+            addpath(genpath("DID-matlab/tools"));
+            import matlab.unittest.TestRunner;
+            import matlab.unittest.TestSuite;
+            runner = TestRunner.withTextOutput;
+            readSuite = TestSuite.fromPackage("did.symmetry.readArtifacts", "IncludingSubpackages", true);
+            readResults = runner.run(readSuite);
+            disp(table(readResults));
+            nFailed = sum([readResults.Failed]);
+            nPassed = sum([readResults.Passed]);
+            nSkipped = sum([readResults.Incomplete]);
+            fprintf("Results: %d passed, %d failed, %d skipped\n", nPassed, nFailed, nSkipped);
+            assert(nFailed == 0, "readArtifacts tests failed");

README.md

Lines changed: 41 additions & 3 deletions
@@ -43,17 +43,55 @@ The `did` library provides a framework for managing and querying data that is or
 
 You can run the tests using either `pytest` (if you installed the development dependencies) or the standard `unittest` module.
 
-**Using pytest (Recommended for development):**
+**Run all tests (unit + symmetry):**
 ```bash
 pytest
 ```
 
-**Using unittest (Standard):**
+**Run only the unit tests (excluding symmetry tests):**
+```bash
+pytest tests/ --ignore=tests/symmetry
+```
+
+**Run only the symmetry tests:**
+```bash
+pytest -m symmetry
+```
+
+**Run only the makeArtifact symmetry tests** (generate cross-language artifacts):
+```bash
+pytest -m make_artifacts
+```
+
+**Run only the readArtifact symmetry tests** (validate artifacts from Python and/or MATLAB):
+```bash
+pytest -m read_artifacts
+```
+
+**Using unittest (unit tests only):**
 ```bash
 python -m unittest discover tests
 ```
 
-Both commands will discover and run all the tests in the `tests` directory.
+#### Symmetry Tests
+
+The `tests/symmetry/` directory contains cross-language symmetry tests that verify
+DID databases created in Python can be read by MATLAB and vice versa:
+
+* **`make_artifacts/`** — Creates a DID database with multiple branches and
+  documents, then writes the database file and JSON summary artifacts to a
+  well-known temporary directory
+  (`<tempdir>/DID/symmetryTest/pythonArtifacts/`).
+* **`read_artifacts/`** — Reads artifacts produced by either the Python or
+  MATLAB test suite, re-summarizes the live database, and compares the
+  result against the saved summary. Tests are parameterized over
+  `matlabArtifacts` and `pythonArtifacts` and skip gracefully when
+  artifacts from a given source are not available.
+
+The CI workflow runs the full cross-language cycle:
+1. MATLAB `makeArtifact` tests create artifacts
+2. Python `makeArtifact` and `readArtifact` tests run (reading MATLAB artifacts)
+3. MATLAB `readArtifact` tests run (reading Python artifacts)
 
 ## Documentation

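The artifact layout described in the README hunk above can be sketched as follows. This is a minimal illustration, assuming the well-known directory is rooted at `tempfile.gettempdir()`; the helper names `artifact_dir` and `available_sources` are hypothetical and are not taken from the repository.

```python
import tempfile
from pathlib import Path

# Source names come from the README text; the helpers below are hypothetical.
ARTIFACT_SOURCES = ("matlabArtifacts", "pythonArtifacts")


def artifact_dir(source, root=None):
    """Return <tempdir>/DID/symmetryTest/<source> for one artifact source."""
    if source not in ARTIFACT_SOURCES:
        raise ValueError(f"unknown artifact source: {source!r}")
    base = Path(root) if root is not None else Path(tempfile.gettempdir())
    return base / "DID" / "symmetryTest" / source


def available_sources(root=None):
    """List the sources whose artifact directory currently exists."""
    return [s for s in ARTIFACT_SOURCES if artifact_dir(s, root).is_dir()]


if __name__ == "__main__":
    for name in ARTIFACT_SOURCES:
        print(name, "->", artifact_dir(name))
    print("available:", available_sources())
```

A read test parameterized over `ARTIFACT_SOURCES` could call `pytest.skip(...)` when its source is absent from `available_sources()`, which is one way to get the "skip gracefully" behavior the README describes.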
pyproject.toml

Lines changed: 8 additions & 0 deletions
@@ -27,6 +27,14 @@ dev = [
     "pytest",
 ]
 
+[tool.pytest.ini_options]
+testpaths = ["tests"]
+markers = [
+    "symmetry: cross-language symmetry tests (MATLAB <-> Python)",
+    "make_artifacts: tests that generate artifacts for symmetry testing",
+    "read_artifacts: tests that read and validate artifacts from another implementation",
+]
+
 [tool.setuptools.packages.find]
 where = ["src"]

src/did/datastructures.py

Lines changed: 1 addition & 2 deletions
@@ -267,8 +267,7 @@ def field_search(a, search_struct):
                     b = True
                     break
         elif op_lower == "or":
-            if isinstance(param1, dict) and isinstance(param2, dict):
-                b = field_search(a, param1) or field_search(a, param2)
+            b = field_search(a, param1) or field_search(a, param2)
         elif op_lower == "depends_on":
             # param1 = dependency name, param2 = dependency value
             if "depends_on" in a:

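To illustrate what the removed `isinstance` guard changed: with the guard in place, an `or` whose operands were not both dicts was never evaluated, so `b` kept its previous value. The toy matcher below is not the library's `field_search`. It assumes, purely for illustration, that an operand may be either a single clause or a list of clauses that must all match; the `exact_string` operation and the field names are likewise assumptions.

```python
def toy_search(doc, query):
    """Toy stand-in for a recursive search (not did's field_search)."""
    # Assumption: a list of clauses means "all clauses must match".
    if isinstance(query, list):
        return all(toy_search(doc, clause) for clause in query)
    op = query["operation"].lower()
    if op == "exact_string":
        return doc.get(query["field"]) == query["param1"]
    if op == "or":
        # No isinstance guard: operands are searched recursively whatever their type.
        return toy_search(doc, query["param1"]) or toy_search(doc, query["param2"])
    raise ValueError(f"unsupported operation: {op}")


doc = {"name": "A", "kind": "demo"}
query = {
    "operation": "or",
    "param1": [  # list operand: a dict-only guard would have skipped this "or"
        {"operation": "exact_string", "field": "name", "param1": "A"},
        {"operation": "exact_string", "field": "kind", "param1": "demo"},
    ],
    "param2": {"operation": "exact_string", "field": "name", "param1": "B"},
}
print(toy_search(doc, query))
```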
src/did/document.py

Lines changed: 29 additions & 4 deletions
@@ -16,10 +16,18 @@ def __init__(self, document_type="base", **kwargs):
         self.document_properties["base"]["datestamp"] = str(datetime.utcnow())
 
         for key, value in kwargs.items():
-            # This is a simplified way to set properties. A full implementation
-            # would need to handle nested properties like 'base.name'.
-            if key in self.document_properties:
-                self.document_properties[key] = value
+            path = key.split(".")
+            if len(path) == 1:
+                if key in self.document_properties:
+                    self.document_properties[key] = value
+            else:
+                d = self.document_properties
+                for p in path[:-1]:
+                    existing = d.get(p)
+                    if not isinstance(existing, dict):
+                        d[p] = {}
+                    d = d[p]
+                d[path[-1]] = value
 
         self._reset_file_info()

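The dotted-key handling in the hunk above can be tried in isolation. The following is a minimal sketch: `set_by_path` is a hypothetical standalone helper, not a function in the repository. It also differs from the commit in one respect: it assigns single-segment keys unconditionally, whereas `__init__` only overwrites a top-level key that already exists.

```python
def set_by_path(props, dotted_key, value):
    """Assign value at a dotted path such as "base.name", creating dicts as needed."""
    path = dotted_key.split(".")
    node = props
    for part in path[:-1]:
        # Replace a missing or non-dict intermediate with a fresh dict,
        # mirroring the isinstance check in the hunk above.
        if not isinstance(node.get(part), dict):
            node[part] = {}
        node = node[part]
    node[path[-1]] = value
    return props


props = {"base": {"name": "", "id": "abc"}}
set_by_path(props, "base.name", "my_doc")   # updates an existing nested field
set_by_path(props, "demoA.value", 5)        # creates the intermediate "demoA" dict
print(props)
```

Because Python keyword names cannot contain dots, a caller would have to pass such keys by unpacking a dict, for example `Document("demoA", **{"base.name": "my_doc"})`.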
@@ -95,6 +103,8 @@ def read_blank_definition(json_file_location_string):
                 # Ensure the 'base' key exists
                 if "base" not in data:
                     data["base"] = {}
+                # Convert flat classname/superclasses to document_class format
+                data = Document._normalize_to_document_class(data)
                 return data
 
         # Fallback for base
@@ -113,6 +123,21 @@ def read_blank_definition(json_file_location_string):
             f"Could not find definition for {json_file_location_string}"
         )
 
+    @staticmethod
+    def _normalize_to_document_class(data):
+        """Convert flat schema format to MATLAB-compatible document_class format."""
+        if "document_class" in data:
+            return data
+        class_name = data.pop("classname", "")
+        superclasses = data.pop("superclasses", [])
+        data["document_class"] = {
+            "class_name": class_name,
+            "property_list_name": class_name,
+            "class_version": 1,
+            "superclasses": superclasses,
+        }
+        return data
+
     def dependency_value(self, dependency_name, error_if_not_found=True):
         if "depends_on" in self.document_properties:
             for dep in self.document_properties["depends_on"]:

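The effect of `_normalize_to_document_class` on a flat definition can be seen with a standalone copy of its logic. This is a sketch for illustration only: the sample definition is hypothetical, and the `definition` key inside `superclasses` is an assumption about the schema rather than something shown in this diff.

```python
import copy


def normalize_to_document_class(data):
    """Standalone copy of the static method's logic, for illustration only."""
    if "document_class" in data:
        return data  # already in the nested format: nothing to do
    class_name = data.pop("classname", "")
    superclasses = data.pop("superclasses", [])
    data["document_class"] = {
        "class_name": class_name,
        "property_list_name": class_name,
        "class_version": 1,
        "superclasses": superclasses,
    }
    return data


# Hypothetical flat definition; the "definition" key is assumed.
flat = {"classname": "demoA", "superclasses": [{"definition": "base"}], "base": {}}

# Work on a deep copy because the function mutates its argument (it pops keys).
nested = normalize_to_document_class(copy.deepcopy(flat))
print(nested["document_class"]["class_name"])
print("classname" in flat, "classname" in nested)
```

Two properties are worth noting: the conversion mutates the dict it is given, and it is idempotent, since a second call finds `document_class` already present and returns the input unchanged.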
src/did/implementations/sqlitedb.py

Lines changed: 2 additions & 2 deletions
@@ -261,11 +261,11 @@ def _normalize_loaded_props(props):
         """
         dc = props.get("document_class", {})
         sc = dc.get("superclasses")
-        if isinstance(sc, dict):
+        if sc is not None and not isinstance(sc, list):
             dc["superclasses"] = [sc]
 
         dep = props.get("depends_on")
-        if isinstance(dep, dict):
+        if dep is not None and not isinstance(dep, list):
             props["depends_on"] = [dep]
 
         return props

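The broadened check in the hunk above wraps any lone non-list value, not just a dict. A minimal sketch of that rule follows, using a hypothetical helper `as_list` that does not exist in the repository. The motivation is presumably that a one-element array written by the other implementation can come back from JSON as a bare value; that reading is an assumption, not something the diff states.

```python
def as_list(value):
    """Wrap a lone value in a list; leave lists and None untouched."""
    if value is not None and not isinstance(value, list):
        return [value]
    return value


# A single dependency stored as a bare object instead of a one-element array.
print(as_list({"name": "item1", "value": "abc"}))
# A non-dict scalar, which the old dict-only check would have left unwrapped.
print(as_list("base"))
# Lists and missing values pass through unchanged.
print(as_list([{"name": "item1", "value": "abc"}]), as_list(None))
```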
src/did/util/__init__.py

Lines changed: 4 additions & 0 deletions
@@ -0,0 +1,4 @@
+from .database_summary import database_summary as database_summary
+from .compare_database_summary import (
+    compare_database_summary as compare_database_summary,
+)

src/did/util/compare_database_summary.py

Lines changed: 132 additions & 0 deletions
@@ -0,0 +1,132 @@
+"""Compare two database summaries and return a list of discrepancy messages.
+
+Mirrors MATLAB's did.util.compareDatabaseSummary for cross-language symmetry testing.
+"""
+
+
+def compare_database_summary(summary_a, summary_b):
+    """Compare two summary dicts and return a list of mismatch descriptions.
+
+    Returns an empty list when the summaries are equivalent.
+    """
+    report = []
+
+    branches_a = summary_a.get("branchNames", [])
+    branches_b = summary_b.get("branchNames", [])
+
+    only_in_a = set(branches_a) - set(branches_b)
+    only_in_b = set(branches_b) - set(branches_a)
+
+    for name in sorted(only_in_a):
+        report.append(f'Branch "{name}" exists only in summary A.')
+    for name in sorted(only_in_b):
+        report.append(f'Branch "{name}" exists only in summary B.')
+
+    # Compare branch hierarchy
+    hier_a = summary_a.get("branchHierarchy", {})
+    hier_b = summary_b.get("branchHierarchy", {})
+    common_branches = sorted(set(branches_a) & set(branches_b))
+
+    for branch_name in common_branches:
+        if branch_name in hier_a and branch_name in hier_b:
+            parent_a = hier_a[branch_name].get("parent", "")
+            parent_b = hier_b[branch_name].get("parent", "")
+            if parent_a != parent_b:
+                report.append(
+                    f'Branch "{branch_name}": parent mismatch '
+                    f'("{parent_a}" vs "{parent_b}").'
+                )
+
+    # Compare per-branch documents
+    br_a = summary_a.get("branches", {})
+    br_b = summary_b.get("branches", {})
+
+    for branch_name in common_branches:
+        if branch_name not in br_a or branch_name not in br_b:
+            continue
+
+        branch_a = br_a[branch_name]
+        branch_b = br_b[branch_name]
+
+        if branch_a["docCount"] != branch_b["docCount"]:
+            report.append(
+                f'Branch "{branch_name}": doc count mismatch '
+                f'({branch_a["docCount"]} vs {branch_b["docCount"]}).'
+            )
+
+        map_a = {d["id"]: d for d in branch_a.get("documents", [])}
+        map_b = {d["id"]: d for d in branch_b.get("documents", [])}
+
+        missing_in_a = sorted(set(map_b) - set(map_a))
+        missing_in_b = sorted(set(map_a) - set(map_b))
+
+        for doc_id in missing_in_a:
+            report.append(
+                f'Branch "{branch_name}": doc "{doc_id}" missing in summary A.'
+            )
+        for doc_id in missing_in_b:
+            report.append(
+                f'Branch "{branch_name}": doc "{doc_id}" missing in summary B.'
+            )
+
+        for doc_id in sorted(set(map_a) & set(map_b)):
+            doc_a = map_a[doc_id]
+            doc_b = map_b[doc_id]
+
+            if doc_a.get("className", "") != doc_b.get("className", ""):
+                report.append(
+                    f'Branch "{branch_name}", doc "{doc_id}": class name mismatch '
+                    f'("{doc_a["className"]}" vs "{doc_b["className"]}").'
+                )
+
+            props_a = doc_a.get("properties", {})
+            props_b = doc_b.get("properties", {})
+
+            for field in ("demoA", "demoB", "demoC"):
+                has_a = field in props_a
+                has_b = field in props_b
+                if has_a and has_b:
+                    val_a = (
+                        props_a[field].get("value")
+                        if isinstance(props_a[field], dict)
+                        else props_a[field]
+                    )
+                    val_b = (
+                        props_b[field].get("value")
+                        if isinstance(props_b[field], dict)
+                        else props_b[field]
+                    )
+                    if val_a != val_b:
+                        report.append(
+                            f'Branch "{branch_name}", doc "{doc_id}": '
+                            f"{field}.value mismatch ({val_a} vs {val_b})."
+                        )
+                elif has_a != has_b:
+                    report.append(
+                        f'Branch "{branch_name}", doc "{doc_id}": '
+                        f'field "{field}" present in one summary but not the other.'
+                    )
+
+            # Compare depends_on
+            deps_a = props_a.get("depends_on", [])
+            deps_b = props_b.get("depends_on", [])
+            if isinstance(deps_a, dict):
+                deps_a = [deps_a]
+            if isinstance(deps_b, dict):
+                deps_b = [deps_b]
+            norm_a = [
+                (d.get("name", ""), d.get("value", ""))
+                for d in deps_a
+                if isinstance(d, dict)
+            ]
+            norm_b = [
+                (d.get("name", ""), d.get("value", ""))
+                for d in deps_b
+                if isinstance(d, dict)
+            ]
+            if norm_a != norm_b:
+                report.append(
+                    f'Branch "{branch_name}", doc "{doc_id}": depends_on mismatch.'
+                )
+
+    return report

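The summary shape that `compare_database_summary` reads can be inferred from the keys it accesses. The sketch below builds two hypothetical summaries with those keys and runs a reduced check covering only the branch-name step; `branch_name_report` is an illustrative stand-in written for this example, and the branch names, document id, and values are invented.

```python
def branch_name_report(summary_a, summary_b):
    """Reduced sketch: report only branches present in one summary but not the other."""
    names_a = set(summary_a.get("branchNames", []))
    names_b = set(summary_b.get("branchNames", []))
    report = [f'Branch "{n}" exists only in summary A.' for n in sorted(names_a - names_b)]
    report += [f'Branch "{n}" exists only in summary B.' for n in sorted(names_b - names_a)]
    return report


# Hypothetical summary using the keys the full function reads.
summary_python = {
    "branchNames": ["a", "a_a"],
    "branchHierarchy": {"a": {"parent": ""}, "a_a": {"parent": "a"}},
    "branches": {
        "a": {
            "docCount": 1,
            "documents": [
                {
                    "id": "doc-1",
                    "className": "demoA",
                    "properties": {"demoA": {"value": 5}, "depends_on": []},
                }
            ],
        },
        "a_a": {"docCount": 0, "documents": []},
    },
}

# A second summary that lacks one branch.
summary_matlab = {
    "branchNames": ["a"],
    "branchHierarchy": {"a": {"parent": ""}},
    "branches": {"a": summary_python["branches"]["a"]},
}

for line in branch_name_report(summary_python, summary_matlab):
    print(line)
```

Returning a list of messages rather than a boolean lets a test assert that the list is empty and show every discrepancy at once when it is not.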