Fine grained breakdown of novelty/uniqueness & symprec sensitivity #85

Open
smglsn12 wants to merge 8 commits into main from feat/fine-grained-novelty
@smglsn12
Collaborator

Fine-grained novelty/uniqueness breakdown and symprec sweep

Summary

1. Fine-grained novelty and uniqueness breakdown

The novelty and uniqueness benchmarks now classify each novel/unique structure into three sub-categories beyond the top-level count:

| Category | Novelty (vs LeMat-Bulk reference) | Uniqueness (within generated set) |
|---|---|---|
| Composition | Composition absent from reference | No other generated structure shares this composition |
| Spacegroup | Composition exists in reference but no reference entry shares the spacegroup | Composition shared, but (composition, spacegroup) pair is unique in the set |
| Structure only | Composition + spacegroup match exists in reference, but fingerprint is unique | (Composition, spacegroup) pair shared, but fingerprint is unique |

New metric keys in results JSON:

  • novel_composition_count, novel_spacegroup_count, novel_structure_only_count
  • unique_composition_count, unique_spacegroup_count, unique_structure_only_count

Both metrics handle SpacegroupAnalyzer failures consistently: when the spacegroup cannot be determined, the structure is conservatively classified as "different spacegroup" (category 2).
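
The novelty column of the table, combined with the failure rule, can be sketched roughly as below. This is an illustration only, not the code in novelty_metric.py: the function name `classify_novel`, the category strings, and the `reference` mapping (composition to the set of spacegroup numbers seen in the reference) are all assumptions, and the structure is assumed to have already been judged novel by fingerprint.

```python
from typing import Optional

# Hypothetical reference index: composition -> spacegroup numbers present in the reference.
Reference = dict[str, set[int]]


def classify_novel(composition: str, spacegroup: Optional[int], reference: Reference) -> str:
    """Return the sub-category for a structure already known to be novel."""
    if composition not in reference:
        return "composition"
    # spacegroup is None when symmetry analysis failed; such structures are
    # treated as having a different spacegroup (the conservative rule above).
    if spacegroup is None or spacegroup not in reference[composition]:
        return "spacegroup"
    return "structure_only"


# Toy reference, for illustration only.
reference = {"NaCl": {225}, "TiO2": {136, 141}}
labels = [
    classify_novel("LiF", 225, reference),    # composition not in reference
    classify_novel("NaCl", 221, reference),   # composition known, spacegroup new
    classify_novel("NaCl", None, reference),  # spacegroup could not be determined
    classify_novel("TiO2", 136, reference),   # composition and spacegroup both known
]
```

Checking composition first means a failed spacegroup determination only matters once the composition is already present in the reference; whether the actual implementation orders the checks this way is not stated here.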

Files changed:

  • src/lemat_genbench/metrics/novelty_metric.py — coded return values in compute_structure, on-demand reference spacegroup lookups with per-composition caching, updated aggregate_results
  • src/lemat_genbench/metrics/uniqueness_metric.py — _get_comp_sg helper, _classify_unique_structures post-hoc classification, updated both BAWL and structure-matcher paths in compute
  • src/lemat_genbench/benchmarks/novelty_benchmark.py — propagates new counts in aggregate_evaluator_results
  • src/lemat_genbench/benchmarks/uniqueness_benchmark.py — propagates new counts in aggregate_evaluator_results
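
For the uniqueness side, a post-hoc classification over the generated set might look like the sketch below. It is not the body of _classify_unique_structures: the function name `classify_unique`, its inputs (one (composition, spacegroup) pair per generated structure, plus the indices already judged unique by fingerprint), and the category strings are assumptions made for illustration.

```python
from collections import Counter
from typing import Optional

Pair = tuple[str, Optional[int]]  # (composition, spacegroup number or None on failure)


def classify_unique(pairs: list[Pair], unique_idx: list[int]) -> dict[int, str]:
    """Label each unique structure by what makes it unique within the set."""
    comp_counts = Counter(comp for comp, _ in pairs)
    # Pairs with an undetermined spacegroup are never counted as a match.
    pair_counts = Counter(p for p in pairs if p[1] is not None)
    labels = {}
    for i in unique_idx:
        comp, sg = pairs[i]
        if comp_counts[comp] == 1:
            labels[i] = "composition"
        elif sg is None or pair_counts[(comp, sg)] == 1:
            labels[i] = "spacegroup"
        else:
            labels[i] = "structure_only"
    return labels


# Toy generated set, for illustration only.
pairs = [("NaCl", 225), ("NaCl", 225), ("NaCl", 221), ("LiF", 225), ("TiO2", None), ("TiO2", 136)]
labels = classify_unique(pairs, unique_idx=[0, 2, 3, 4])
```

Counting once over the whole set and then labelling keeps the classification independent of whether uniqueness itself was decided by the BAWL path or the structure-matcher path.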

2. Explicit symprec=0.01 via module-level _SYMPREC

Every file that calls SpacegroupAnalyzer or Structure.get_space_group_info now reads from a module-level _SYMPREC = 0.01 variable instead of relying on pymatgen's implicit default. This provides:

  • Durability against pymatgen updates — if pymatgen changes its default symprec in a future release, our behaviour is pinned.
  • Runtime override — the symprec sweep script sets _SYMPREC on each module before running benchmarks.
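
The runtime override only works if each function reads `_SYMPREC` when it is called rather than capturing it as a default argument. The sketch below shows the difference using a throwaway stand-in module; `fake_metric`, `current_symprec`, and `frozen_symprec` are hypothetical names, not part of lemat_genbench.

```python
import types

SOURCE = '''
_SYMPREC = 0.01

def current_symprec():
    # Global looked up on every call, so a later override is seen.
    return _SYMPREC

def frozen_symprec(symprec=_SYMPREC):
    # Default captured once at definition time, so overrides are ignored.
    return symprec
'''

# Build a stand-in module so the example does not depend on the real package.
metric = types.ModuleType("fake_metric")
exec(SOURCE, metric.__dict__)

before = metric.current_symprec()
metric._SYMPREC = 0.1  # what a sweep script would do on each module
after = metric.current_symprec()
frozen = metric.frozen_symprec()
```

Here `before` is 0.01, `after` is 0.1, and `frozen` stays at 0.01. The same caveat applies to `from module import _SYMPREC`, which copies the value into the importing namespace at import time.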

Confirmed that 0.01 matches pymatgen's current default via inspect.signature(SpacegroupAnalyzer.__init__) and inspect.signature(Structure.get_space_group_info).
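
A check along these lines can be written with `inspect.signature`. The helper `default_of` and the stand-in function are hypothetical; the pymatgen part only runs when pymatgen is importable.

```python
import inspect


def default_of(func, name):
    """Return the default value of parameter `name` in func's signature."""
    return inspect.signature(func).parameters[name].default


# Hypothetical stand-in so the helper can be exercised without pymatgen.
def get_space_group_info(symprec=0.01, angle_tolerance=5.0):
    ...


stand_in_default = default_of(get_space_group_info, "symprec")

try:
    from pymatgen.core import Structure
    from pymatgen.symmetry.analyzer import SpacegroupAnalyzer
except ImportError:
    pymatgen_defaults = None  # pymatgen not installed; nothing to check
else:
    pymatgen_defaults = (
        default_of(SpacegroupAnalyzer.__init__, "symprec"),
        default_of(Structure.get_space_group_info, "symprec"),
    )
```

Comparing `pymatgen_defaults` against `_SYMPREC` in a test would flag any future change to pymatgen's defaults, although with the value pinned such a change would no longer affect results.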

Files changed:

  • src/lemat_genbench/metrics/novelty_metric.py
  • src/lemat_genbench/metrics/uniqueness_metric.py
  • src/lemat_genbench/metrics/validity_metrics.py
  • src/lemat_genbench/metrics/diversity_metric.py
  • src/lemat_genbench/fingerprinting/crystallographic_analyzer.py
  • src/lemat_genbench/utils/oxidation_state.py
  • src/lemat_genbench/lemat_scraping/lematbulk_oxi_states.py
  • src/lemat_genbench/preprocess/augmented_fingerprint_preprocess.py
  • src/lemat_genbench/preprocess/distribution_preprocess.py
  • scripts/explore_novelty.py

3. Symprec sweep script

scripts/run_symprec_sweep.py runs validity, novelty, uniqueness, and diversity benchmarks across a configurable set of symprec values (default: [1e-5, 1e-3, 0.01, 0.1, 0.5]). Accepts the same --csv/--cifs input as run_benchmarks.py. All results are saved to a single JSON file in results_final/ keyed by symprec value.

uv run scripts/run_symprec_sweep.py --csv <csv_path> --name <name>
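
The overall shape of the sweep, one results entry per symprec value written to a single JSON file, can be sketched as follows. This is not the script itself: `run_benchmarks` is a hypothetical stand-in that returns placeholder data, the output path is a temporary file, and using `str(symprec)` as the JSON key is an assumption about the key format.

```python
import json
import tempfile
from pathlib import Path

SYMPREC_VALUES = [1e-5, 1e-3, 0.01, 0.1, 0.5]  # default values listed above


def run_benchmarks(symprec: float) -> dict:
    """Hypothetical stand-in for running the four benchmarks at one symprec."""
    names = ("validity", "novelty", "uniqueness", "diversity")
    return {name: {"symprec_used": symprec} for name in names}


def sweep(values, out_path: Path) -> dict:
    results = {}
    for symprec in values:
        # The real script sets _SYMPREC on each module at this point.
        results[str(symprec)] = run_benchmarks(symprec)
    out_path.write_text(json.dumps(results, indent=2))
    return results


out_file = Path(tempfile.mkdtemp()) / "sweep_example.json"
results = sweep(SYMPREC_VALUES, out_file)
loaded = json.loads(out_file.read_text())
```

JSON object keys must be strings, so the float values have to be serialized somehow; with `str()` the value 1e-5 becomes the key "1e-05".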

4. Novelty/uniqueness breakdown visualization

scripts/plot_novelty_uniqueness_breakdown.py produces two side-by-side bar charts from a single benchmark results JSON, showing the total count and per-category breakdown (composition / spacegroup / structure only) with percentages for both novelty and uniqueness. A dashed line marks the total structures evaluated.

uv run scripts/plot_novelty_uniqueness_breakdown.py results_final/<output_file>
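
The bar heights and percentage labels could be derived from the new metric keys roughly as below. The helper `breakdown` is hypothetical, the counts are made-up illustrative numbers, and two assumptions are baked in: the three sub-category counts sum to the top-level count, and percentages are taken relative to the total number of structures evaluated.

```python
def breakdown(results: dict, prefix: str, total: int) -> dict:
    """Map each bar label to (count, percent of total structures evaluated)."""
    categories = ("composition", "spacegroup", "structure_only")
    counts = {c: results[f"{prefix}_{c}_count"] for c in categories}
    # Assumption: the sub-categories partition the top-level count.
    counts["total"] = sum(counts.values())
    return {label: (n, 100.0 * n / total) for label, n in counts.items()}


# Made-up numbers, for illustration only.
example = {
    "novel_composition_count": 40,
    "novel_spacegroup_count": 25,
    "novel_structure_only_count": 15,
}
bars = breakdown(example, "novel", total=100)
```

Passing `prefix="unique"` would give the matching values for the uniqueness chart from the `unique_*_count` keys.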

5. Symprec sweep visualization

scripts/plot_symprec_sweep.py produces one subplot per symprec value, each with four bars: overall validity, overall novelty, overall uniqueness, and spacegroup diversity. Bar colors are consistent across subplots for easy comparison.

uv run scripts/plot_symprec_sweep.py results_final/<output_file>

Tests

  • uv run scripts/plot_symprec_sweep.py results_final/<sweep_output>.json — verify plot is generated
  • uv run scripts/plot_novelty_uniqueness_breakdown.py results_final/<output>.json — verify breakdown plot is generated
  • Existing tests pass: uv run pytest

@smglsn12 smglsn12 requested review from Ramlaoui and sid-betalol and removed request for sid-betalol April 17, 2026 22:02
@smglsn12
Collaborator Author

> Both metrics handle SpacegroupAnalyzer failures consistently: when the spacegroup cannot be determined, the structure is conservatively classified as "different spacegroup" (category 2).

@sid-betalol @Ramlaoui I'm actually not certain this is the best approach here, but it seems to be what we've been doing and it's the easiest to standardize, so I've left it for now.
