Fine grained breakdown of novelty/uniqueness & symprec sensitivity #85
Open
…p to test sensitivity
@sid-betalol @Ramlaoui I'm actually not certain this is the best approach here, but it seems to be what we've been doing and it's the easiest to standardize, so I've left it for now.
Fine-grained novelty/uniqueness breakdown and symprec sweep

Summary

1. Fine-grained novelty and uniqueness breakdown

The novelty and uniqueness benchmarks now classify each novel/unique structure into three sub-categories beyond the top-level count: (1) composition, (2) spacegroup, and (3) structure only.

New metric keys in results JSON:
- Novelty: `novel_composition_count`, `novel_spacegroup_count`, `novel_structure_only_count`
- Uniqueness: `unique_composition_count`, `unique_spacegroup_count`, `unique_structure_only_count`

Both metrics handle `SpacegroupAnalyzer` failures consistently: when the spacegroup cannot be determined, the structure is conservatively classified as "different spacegroup" (category 2).

Files changed:
- `src/lemat_genbench/metrics/novelty_metric.py`: coded return values in `compute_structure`, on-demand reference spacegroup lookups with per-composition caching, updated `aggregate_results`
- `src/lemat_genbench/metrics/uniqueness_metric.py`: `_get_comp_sg` helper, `_classify_unique_structures` post-hoc classification, updated both BAWL and structure-matcher paths in `compute`
- `src/lemat_genbench/benchmarks/novelty_benchmark.py`: propagates new counts in `aggregate_evaluator_results`
- `src/lemat_genbench/benchmarks/uniqueness_benchmark.py`: propagates new counts in `aggregate_evaluator_results`

2. Explicit `symprec=0.01` via module-level `_SYMPREC`

Every file that calls `SpacegroupAnalyzer` or `Structure.get_space_group_info` now reads from a module-level `_SYMPREC = 0.01` variable instead of relying on pymatgen's implicit default. This makes the tolerance explicit and allows it to be overridden by setting `_SYMPREC` on each module before running benchmarks.

Confirmed that `0.01` matches pymatgen's current default via `inspect.signature(SpacegroupAnalyzer.__init__)` and `inspect.signature(Structure.get_space_group_info)`.

Files changed:
- `src/lemat_genbench/metrics/novelty_metric.py`
- `src/lemat_genbench/metrics/uniqueness_metric.py`
- `src/lemat_genbench/metrics/validity_metrics.py`
- `src/lemat_genbench/metrics/diversity_metric.py`
- `src/lemat_genbench/fingerprinting/crystallographic_analyzer.py`
- `src/lemat_genbench/utils/oxidation_state.py`
- `src/lemat_genbench/lemat_scraping/lematbulk_oxi_states.py`
- `src/lemat_genbench/preprocess/augmented_fingerprint_preprocess.py`
- `src/lemat_genbench/preprocess/distribution_preprocess.py`
- `scripts/explore_novelty.py`

3. Symprec sweep script
`scripts/run_symprec_sweep.py` runs validity, novelty, uniqueness, and diversity benchmarks across a configurable set of symprec values (default: `[1e-5, 1e-3, 0.01, 0.1, 0.5]`). It accepts the same `--csv`/`--cifs` input as `run_benchmarks.py`. All results are saved to a single JSON file in `results_final/`, keyed by symprec value.

4. Novelty/uniqueness breakdown visualization
`scripts/plot_novelty_uniqueness_breakdown.py` produces two side-by-side bar charts from a single benchmark results JSON, showing the total count and per-category breakdown (composition / spacegroup / structure only) with percentages for both novelty and uniqueness. A dashed line marks the total structures evaluated.

5. Symprec sweep visualization
`scripts/plot_symprec_sweep.py` produces one subplot per symprec value, each with four bars: overall validity, overall novelty, overall uniqueness, and spacegroup diversity. Bar colors are consistent across subplots for easy comparison.

Tests
- `uv run scripts/plot_symprec_sweep.py results_final/<sweep_output>.json`: verify plot is generated
- `uv run scripts/plot_novelty_uniqueness_breakdown.py results_final/<output>.json`: verify breakdown plot is generated
- `uv run pytest`
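For reviewers, the three-way breakdown from section 1 can be sketched roughly as follows. This is a minimal illustration, not the code in `novelty_metric.py` or `uniqueness_metric.py`: the names `classify` and `breakdown`, the `ref_spacegroups` lookup, and the use of plain formula strings and spacegroup numbers are all assumptions made for the example. It also assumes each input structure has already been judged novel, and it represents a failed spacegroup determination as `None`.

```python
from collections import Counter
from typing import Dict, Iterable, Optional, Set, Tuple

# Category labels chosen to line up with the new metric key names.
CATEGORIES = ("composition", "spacegroup", "structure_only")


def classify(composition: str,
             spacegroup: Optional[int],
             ref_spacegroups: Dict[str, Set[int]]) -> str:
    """Assign an already-novel structure to one of the three sub-categories.

    `ref_spacegroups` is a hypothetical lookup from a composition to the set
    of spacegroup numbers seen for that composition in the reference data.
    """
    known = ref_spacegroups.get(composition)
    if known is None:
        return "composition"       # category 1: composition not in reference
    if spacegroup is None or spacegroup not in known:
        # category 2: also the conservative fallback when the spacegroup
        # could not be determined (represented here as None)
        return "spacegroup"
    return "structure_only"        # category 3: same composition and spacegroup


def breakdown(entries: Iterable[Tuple[str, Optional[int]]],
              ref_spacegroups: Dict[str, Set[int]],
              prefix: str = "novel") -> Dict[str, int]:
    """Count structures per category under keys like `novel_composition_count`."""
    counts = Counter(classify(c, sg, ref_spacegroups) for c, sg in entries)
    return {f"{prefix}_{cat}_count": counts[cat] for cat in CATEGORIES}


# Toy data for illustration only.
reference = {"NaCl": {225}, "TiO2": {136, 141}}
novel = [("LiF", 225), ("NaCl", 221), ("TiO2", None), ("TiO2", 136)]
result = breakdown(novel, reference)
```

Passing `prefix="unique"` yields the uniqueness key names. Whichever way the real implementation is structured, the three counts should sum to the top-level novel (or unique) count.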
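The override pattern behind the symprec sweep (sections 2 and 3) can be sketched in the same spirit. This is not the contents of `scripts/run_symprec_sweep.py`: the stand-in modules, the `evaluate` function, and `run_sweep` are hypothetical and exist only to show how a module-level `_SYMPREC` might be set before each run, with results keyed by symprec value.

```python
import json
import types

# Default sweep values as listed in the PR description.
DEFAULT_SYMPRECS = [1e-5, 1e-3, 0.01, 0.1, 0.5]


def make_fake_metric_module(name):
    """Hypothetical stand-in for a metrics module with a module-level _SYMPREC."""
    module = types.ModuleType(name)
    module._SYMPREC = 0.01
    # `evaluate` looks up module._SYMPREC when called, so later overrides apply.
    module.evaluate = lambda: {"symprec_used": module._SYMPREC}
    return module


def run_sweep(modules, symprecs=DEFAULT_SYMPRECS):
    """Run every module once per symprec value and key results by that value."""
    originals = {m.__name__: m._SYMPREC for m in modules}
    results = {}
    try:
        for symprec in symprecs:
            for m in modules:
                m._SYMPREC = symprec          # override before running
            # JSON object keys must be strings, so the float is stringified.
            results[str(symprec)] = {m.__name__: m.evaluate() for m in modules}
    finally:
        for m in modules:                     # restore the original tolerance
            m._SYMPREC = originals[m.__name__]
    return results


modules = [make_fake_metric_module("novelty"), make_fake_metric_module("uniqueness")]
sweep_results = run_sweep(modules)
sweep_json = json.dumps(sweep_results, indent=2)
```

One caveat with this pattern in general: an override only takes effect if the code reads `_SYMPREC` at call time. A function declared with `symprec=_SYMPREC` as a default argument captures the value when it is defined and would not see later changes.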