feat(packaging): ship notebooks in-package and fetch demo data from Zenodo #317
Open
daharoni wants to merge 4 commits into
…enodo

Move the pipeline and cross-registration notebooks into minian/notebooks/ so they ship inside the wheel, and pull demo data on demand from Zenodo instead of committing ~700 MB of binaries to the repo.

- minian.data: pooch-backed, checksum-verified fetch() with a Zenodo registry (pipeline-demo, cross-reg-sessions) and a minian-data CLI
- minian-notebooks CLI copies bundled notebooks out of the package; minian-install --notebooks/--demo kept as deprecated aliases
- notebooks drop the sys.path hacks and reference assets/ relatively
- demo_movies/ and demo_data/ binaries removed; stub READMEs point at the data registry
- tests: discover and smoke-execute bundled notebooks; the golden pipeline and cross-reg runs are marked slow and resolve data via fetch()
- CI: prefetch+cache demo-data job; default pytest deselects slow while the full matrix runs notebooks via test-all; lint verifies notebooks are output-stripped; pre-commit nbstripout hook
- deps: add pooch

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Add pooch to pdm.lock (it was added to dependencies for on-demand demo data fetching) so `pdm lock --check` passes, and clear the stale execution_count from pipeline.ipynb so the notebooks-stripped check passes. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The test matrix runs on windows-latest, but prefetch-data only ran on ubuntu/macos and the cache paths omitted the Windows location, so Windows test jobs had no primed cache. Add windows-latest to the prefetch matrix and include the Windows pooch cache dir (AppData/Local/minian/minian/Cache) in both cache steps. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
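For reviewers unfamiliar with what a "notebooks-stripped" check has to look for, here is a minimal sketch. It assumes the standard notebook JSON layout (code cells carrying `outputs` and `execution_count` fields); the function name `unstripped_cells` is hypothetical and this is not the check this PR actually ships.

```python
import json


def unstripped_cells(notebook):
    """Return indices of code cells that still carry outputs or an execution_count.

    `notebook` is the parsed JSON of an .ipynb file (a dict with a "cells" list).
    Hypothetical helper for illustration only.
    """
    offenders = []
    for index, cell in enumerate(notebook.get("cells", [])):
        if cell.get("cell_type") != "code":
            continue  # markdown/raw cells have nothing to strip
        has_outputs = bool(cell.get("outputs"))
        has_count = cell.get("execution_count") is not None
        if has_outputs or has_count:
            offenders.append(index)
    return offenders


# A tiny in-memory notebook: the second cell has a stale execution_count,
# which is the kind of leftover the second commit above clears.
example = json.loads(
    """
    {"cells": [
        {"cell_type": "markdown", "source": ["# title"]},
        {"cell_type": "code", "source": ["x = 1"], "outputs": [], "execution_count": 3},
        {"cell_type": "code", "source": ["y = 2"], "outputs": [], "execution_count": null}
    ]}
    """
)
print(unstripped_cells(example))
```

A check along these lines would fail the build whenever the returned list is non-empty.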
This was referenced Jun 4, 2026
What
Removes the ~700 MB of demo binaries (`demo_movies/`, `demo_data/`) from the repo and replaces them with on-demand fetching from Zenodo, and ships the tutorial notebooks inside the installed package.

Changes

- `minian/data/` - new fetch/cache API (`fetch`, `dataset_path`, `datasets`) backed by `pooch`, a per-file registry with SHA256 checksums (`_registry.py`), and a `minian-data` CLI (`list`/`download`/`path`). Datasets are hosted on Zenodo (one deposit/DOI each) and verified on every access. `MINIAN_DATA_DIR` is an offline escape hatch.
- `minian/notebooks/` - the pipeline and cross-registration notebooks now live in-package as self-contained bundles (notebook + `assets/` + `README.md`). `minian-notebooks` CLI copies a bundle out to a working directory.
- `minian/install.py` - reduced to a thin back-compat shim delegating to the two new CLIs.
- `_notebook.py` helper; heavy end-to-end runs are marked `slow` (deselected by `pdm run test`, run in full by `test-all`/CI). Golden-value assertions unchanged.
- `prefetch-data` job caches datasets once per OS; `pdm-backend` ships notebooks in the wheel; `nbstripout` pre-commit hook + `notebooks-stripped` CI keep committed notebooks output-free.
- `scripts/` - `prefetch_data.py` (CI cache priming) and `zenodo_manifest.py` (maintainer staging tool).

Notes
_registry.py.
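
To make "verified on every access" and the `MINIAN_DATA_DIR` escape hatch concrete, the following is a rough, standard-library-only sketch of the idea. It is not the `minian.data` implementation and does not use `pooch`; the function names (`resolve_data_dir`, `verified_path`), the registry layout, and the file name `demo.bin` are assumptions made for illustration.

```python
import hashlib
import os
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large movies never need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def resolve_data_dir(default_cache):
    """Prefer MINIAN_DATA_DIR when set (offline use), else the default cache."""
    override = os.environ.get("MINIAN_DATA_DIR")
    return Path(override) if override else Path(default_cache)


def verified_path(root, dataset, fname, registry):
    """Return the local path of a dataset file after checking its SHA256.

    `registry` is a hypothetical mapping: dataset -> {file name: sha256 hex}.
    """
    expected = registry[dataset][fname]
    path = Path(root) / dataset / fname
    if not path.is_file():
        raise FileNotFoundError(f"{path} is missing; download it first")
    actual = sha256_of(path)
    if actual != expected:
        raise ValueError(f"checksum mismatch for {path}: got {actual}")
    return path


# Demo with a throwaway file standing in for a downloaded dataset.
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "pipeline-demo" / "demo.bin"
    target.parent.mkdir(parents=True)
    target.write_bytes(b"demo bytes")
    demo_registry = {
        "pipeline-demo": {"demo.bin": hashlib.sha256(b"demo bytes").hexdigest()}
    }
    print(verified_path(tmp, "pipeline-demo", "demo.bin", demo_registry).name)
```

Re-hashing on each access costs some I/O for large files, but it means a truncated or tampered cache entry is caught before a golden-value test runs against it.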