… color The .anndata-text--warning rule was incorrectly removed during cleanup but is still applied in registry.py. The .anndata-dtype--ndarray class was defined in constants and used by formatters but never had a CSS rule, falling through unstyled. It now shares the --array color variable.
Keep both copy_on_write_X setting from main and HTML repr settings from html_rep branch.
Two-tier detection: tier 1 uses the canonical has_xp() protocol check from anndata.compat (catches JAX, numpy >=2.0); tier 2 falls back to duck-typing (shape/dtype/ndim) for arrays that don't yet implement the full protocol (PyTorch, TensorFlow). Also uses __array_namespace__() for backend label resolution and updates stale PR scverse#2063 → scverse#2071.
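A minimal sketch of the two-tier check described in this commit, assuming a plain `__array_namespace__` lookup as a stand-in for the real `has_xp()` helper (whose exact behavior is not shown in this thread). The function names and mock classes here are hypothetical and only illustrate the fallback order.

```python
from __future__ import annotations

from types import SimpleNamespace


def detect_array_like(obj: object) -> str | None:
    """Return which tier matched: "protocol", "duck", or None."""
    # Tier 1 (assumed stand-in for has_xp()): the object exposes an Array API namespace.
    if callable(getattr(obj, "__array_namespace__", None)):
        return "protocol"
    # Tier 2: duck typing for arrays without the full protocol.
    if all(hasattr(obj, attr) for attr in ("shape", "dtype", "ndim")):
        return "duck"
    return None


def backend_label(obj: object) -> str:
    """Resolve a short backend name, preferring the Array API namespace."""
    get_namespace = getattr(obj, "__array_namespace__", None)
    if callable(get_namespace):
        return get_namespace().__name__.split(".")[0]
    # Fallback: top-level module of the object's type.
    return type(obj).__module__.split(".")[0]


class ProtocolArray:
    """Hypothetical array that implements the protocol hook."""

    shape, dtype, ndim = (2, 3), "float32", 2

    def __array_namespace__(self, api_version=None):
        return SimpleNamespace(__name__="fake_xp.numpy")


class DuckArray:
    """Hypothetical array that only looks like an array."""

    shape, dtype, ndim = (4,), "int64", 1
```

Checking the protocol first keeps the label consistent for backends that already implement it, while the duck-typed tier avoids dropping arrays to a generic fallback.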
… arrays Device info (cuda:0, tpu:0, GPU:0, etc.) is now shown inline in the type column instead of being hidden in tooltips. Adds visual inspection test 26.
…ce-based coloring
CuPy ≥12 implements the full Array API protocol, so the dedicated formatter
was redundant. ArrayAPIFormatter now handles CuPy arrays (with GPU:{device.id}
for clean labels) and colors all array-api arrays by device type: GPU green,
TPU teal, CPU/other amber — uniformly across backends.
Also removes unused CSS_DTYPE_ARRAY constant and its CSS selector.
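To illustrate the device-based coloring, here is a small sketch that classifies a device label and maps it to a color family. The prefix rules and the `DEVICE_COLORS` names are assumptions made for illustration, not the PR's actual constants.

```python
# Hypothetical mapping from device kind to the color family named in the commit.
DEVICE_COLORS = {"gpu": "green", "tpu": "teal", "cpu": "amber"}


def device_kind(device: object) -> str:
    """Classify a device label such as "cuda:0", "GPU:0", "tpu:0", or "cpu"."""
    text = str(device).lower()
    if text.startswith(("cuda", "gpu")):
        return "gpu"
    if text.startswith("tpu"):
        return "tpu"
    # CPU and anything unrecognized share the same color.
    return "cpu"


def device_color(device: object) -> str:
    return DEVICE_COLORS[device_kind(device)]
```

Because the classification depends only on the device label, the same rule applies to every backend.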
|
Hi, thanks for all the work! Quite a bit smaller now, but +22k still fills me with dread. (I know a lot is tests)
Hmm, I’ll take a look, but I’m not sure I want to risk it existing.
I think you missed that you can instead wrap safe markup in
That makes no sense to me. Using a theme setting if we can detect one and defaulting to the OS setting if we can’t is the best we can do, so why not do that?
Huh, didn’t know that’s possible, I recommended nesting mainly for descendant selectors ( I’m pretty happy with the state of the CSS now, but just FYI, there’s options:
But as said, just some pointers, no need to put a lot of work into that, the CSS looks fine!
We could use a
Link for future reference: w3c/css-validator#431
First,
Just make it return a
what’s that, do you have a link? |
…move markdown parser - Add @media (prefers-color-scheme: dark) as OS-level fallback (Tier 1), explicit light selectors (Tier 2) override when app is in light mode, existing dark selectors (Tier 3) unchanged. CSS variables defined once in css.py with placeholder substitution to avoid duplication. - Make TypeFormatter generic via PEP 695 (TypeFormatter[T]). can_format() returns TypeGuard[T], format() receives narrowed type without manual casts. Duck-typed formatters use TypeFormatter[object] with type: ignore. - Remove markdown-parser.js (6.7KB) and markdown.py. README content shown as plain text via <pre> + textContent (XSS-safe). Remove ~110 lines of markdown-specific CSS. - Add w3c/css-validator#431 reference where CSS nesting validation is skipped. - Update visual_inspect_repr_html.py descriptions for plain-text README.
|
@flying-sheep Thanks for the thorough review! Here's what we've addressed and where we landed on the discussion points.
Changes made
Dark mode CSS. Added TypeGuard for README as plain text. Removed the JavaScript markdown parser (
CSS validator link. Added reference to w3c/css-validator#431 where we skip CSS nesting validation errors.
On Jinja / markupsafe
Thanks for the
Current approach. Explicit I think the current approach is the right fit for this architecture, but happy to discuss further. |
On the PR size (~22K lines)
Happy to discuss what could be simplified or split. Here's an honest breakdown of where the lines went.
Summary
Tests and test tooling account for 63% of the PR. The implementation itself is ~8,150 lines.
Source code breakdown
formatters.py (1,172 lines): 20 type-specific formatters covering ndarray, masked arrays, sparse matrices, backed sparse, DataFrames, Series, categoricals, lazy columns, dask, awkward, array-API/CuPy, nested AnnData, None, bool, int, float, str, dict, color lists, and generic list/tuple. Each formatter is ~50 lines average, with the larger ones (categorical, array-API) handling color swatches, device info, and dtype CSS classes. This is the primary extension point for ecosystem packages.
registry.py (1,044 lines): The plugin system. Bulk comes from:
utils.py (790 lines): Shared helpers: serialization checking via the IO registry, value preview generation (dicts, lists, strings with truncation), color detection and CSS sanitization (whitelist-based, blocks injection), HTML escaping, memory formatting, key validation. The color sanitization alone is ~60 lines because it validates against CSS named colors, hex, rgb(), and hsl() while blocking url(), expression(), and semicolons.
html.py (637 lines): The entry point. Orchestrates header (shape, badges, README icon, search), section rendering loop, footer (version, memory), and wraps everything with scoped CSS/JS. Handles settings capture, container ID generation, and the overall HTML structure.
components.py (618 lines): Reusable UI components: section headers with fold/expand, entry rows with name/type/preview columns, badges, warning icons, copy buttons, search box. These are the building blocks that ecosystem packages can use directly.
sections.py (563 lines): Section renderers for obs/var DataFrames (with column width calculation), mapping sections (obsm, varm, obsp, varp, layers), uns (recursive dict traversal with depth limit), and raw.
__init__.py (468 lines): Public API with
core.py (401 lines): Shared rendering primitives: format_number (with comma grouping), table rendering for DataFrame expansion, and entry rendering coordination between formatters and HTML output.
lazy.py (346 lines): Lazy AnnData support. Detects lazy mode, reads partial categories from disk without triggering full materialization, determines column dtypes from storage metadata. Wrapped in try/except with graceful fallback.
css.py (97 lines): CSS loader with dark/light variable placeholder substitution (define color blocks once, substitute into both
javascript.py (49 lines): JS loader.
Static assets
repr.css (1,050 lines): Scoped CSS with native nesting. Covers: layout grid, section headers, entry rows, type column with dtype-specific colors (12 dtype classes), dark mode (three-tier: OS media query, explicit light override, dark theme selectors for Jupyter/Sphinx), README modal, search box, fold/expand animations, badges, warning/error styling, color swatches, copy buttons, scrollable containers. All scoped under
repr.js (509 lines): Fold/expand toggle, search with regex support and toggle buttons, copy-to-clipboard, README modal with keyboard accessibility, wrap-mode toggle for long type strings, ResizeObserver for responsive layout.
css_colors.txt (197 lines): CSS named colors for
Tests
Average test is 16 lines. Tests are split by concern:
test_repr_robustness.py (1,493 lines) is the largest because it covers 72 edge cases: escaping at every user-data insertion point (probe-based, not attack-vector-based), unicode handling, crashing objects, circular references, size limits, concurrent access, and error accumulation. These are intentionally thorough because
Test infrastructure
html_validator.py (836 lines): Regex-based HTML validator with structured assertions (
conftest.py (272 lines): Shared fixtures: AnnData factories for various configurations, the
Visual inspection harness
visual_inspect_repr_html.py (3,365 lines): Generates an HTML page with 26+ scenarios for manual review. Not a pytest test. Includes: basic/empty/view AnnData, lazy mode, backed mode, deep nesting, many categories, custom sections (TreeData/MuData/SpatialData mocks), README modal, adversarial data, ecosystem extensibility demos. The HTML template itself is ~2,200 lines (inline CSS for the test page layout, accordion sections, checklists). This could live in a separate repo or as a notebook, but having it adjacent to the code makes it easy to regenerate during development.
What could be reduced?
Genuinely open to suggestions. Some candidates:
None of these would change the order of magnitude. The feature has genuine breadth: 20 type formatters, a plugin registry, 11 configurable settings, dark mode, lazy mode support, serialization warnings, and search. For comparison, pandas' The test-to-code ratio of 1.7:1 reflects a deliberate choice: |
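The whitelist-based color sanitization described in the utils.py breakdown can be sketched roughly as follows. This is an illustrative approximation, not the PR's code: the `NAMED_COLORS` subset stands in for the full css_colors.txt list, and the patterns are assumptions about what "hex, rgb(), and hsl()" accept.

```python
from __future__ import annotations

import re

# Small stand-in for the full CSS named-color list (assumption).
NAMED_COLORS = frozenset({"red", "green", "blue", "black", "white", "gray", "orange", "teal"})

# 3-, 4-, 6-, or 8-digit hex colors.
_HEX = re.compile(r"#(?:[0-9a-fA-F]{3,4}|[0-9a-fA-F]{6}|[0-9a-fA-F]{8})")
# rgb()/hsl() with only digits, separators, and percent signs inside.
_FUNC = re.compile(r"(?:rgb|rgba|hsl|hsla)\(\s*[0-9.,%\s/]+\)")


def sanitize_css_color(value: object) -> str | None:
    """Return a color that is safe to place in a style attribute, else None."""
    if not isinstance(value, str):
        return None
    text = value.strip()
    lowered = text.lower()
    if lowered in NAMED_COLORS:
        return lowered
    if _HEX.fullmatch(text) or _FUNC.fullmatch(lowered):
        return text
    # Anything else (url(), expression(), semicolons, ...) is rejected.
    return None
```

Because only full matches against known-good shapes pass, injection attempts such as `url(...)` or a trailing `; background: ...` fail without needing an explicit blocklist.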
The expanded raw subsection now displays index previews matching the main AnnData header, with graceful "not available" fallback when indices are absent or inaccessible.
Upstream added `size: int` to `SupportsArrayApi`, causing `has_xp()` to reject the mock and `coerce_array` to raise.
|
Hi! I’m sorry, I wanted to review that again earlier. I’ll take time early next week, but here are a few things already:
Yeah, I’m sorry if I misled you but the way to actually fix this exists: replace media queries with light-dark(…). It actually responds to the used color scheme. The used color scheme defaults to So the CSS should look like this (absolutely no duplication required): .anndata-repr {
/* Explicit light themes force the light value of light-dark() */
body.light-mode &,
[data-theme="light"] &,
[data-jp-theme-light="true"] &,
.jp-Theme-Light &,
body.vscode-light &,
body[data-vscode-theme-kind="vscode-light"] & {
color-scheme: light;
}
/* Explicit dark themes force the dark value */
body.dark-mode &,
[data-theme="dark"] &,
[data-jp-theme-light="false"] &,
.jp-Theme-Dark &,
body.vscode-dark &,
body[data-vscode-theme-kind="vscode-dark"] & {
color-scheme: dark;
}
--anndata-bg-primary: light-dark(#ffffff, #1e1e1e);
...
}
Wait, you’re using tables to layout things? I think that was already considered problematic in the 2010s! I think using |
Replace the three-tier dark mode system (@media queries + explicit light/dark selectors with Python string substitution) with CSS light-dark() and color-scheme. Each color variable is now defined once, the Python-side placeholder replacement in css.py is removed, and theme selectors simply set color-scheme: light/dark.
Replace table-based layout with CSS grid + subgrid for regular entries and native <details>/<summary> for expandable entries, eliminating JS expand/collapse logic entirely. The whole entry row acts as the <summary> toggle with a subtle arrow indicator in the preview column. Fix name column width calculation to account for CSS grid border-box sizing (column width includes cell padding, unlike table content-box).
|
Thanks for the follow-up, and no worries on timing!
|
| Feature | Chrome | Firefox | Safari |
|---|---|---|---|
| light-dark() | 123+ (Mar 2024) | 120+ (Nov 2023) | 17.5+ (May 2024) |
| CSS nesting | 120+ (Dec 2023) | 117+ (Aug 2023) | 17.2+ (Dec 2023) |
| subgrid | 117+ (Sep 2023) | 71+ (Dec 2019) | 16+ (Sep 2022) |
| <details>/<summary> | everywhere | everywhere | everywhere |
The HTML output passes Nu Html Checker (vnu) W3C validation, run in CI when vnu is installed. vnu covers everything except native CSS nesting (w3c/css-validator#431), so CSS parse errors are filtered out.
- Add missing CSS rule for wrap button expansion on preview cells - Remove dead TypeCellConfig.has_expandable_content field - Fix zebra striping to skip hidden entries during search filtering - Add subgrid to CSS browser compat header comments - Fix inaccurate DOM structure comment in JS - Align DEFAULT_FIELD_WIDTH_PX with MIN_FIELD_WIDTH_PX (104)
675.feature.md → 2236.feat.md
|
Great! I think this is really coming along, thank you for your patience! I think one thing @ilan-gold said early is that he’d basically accept one of two approaches:
@ilan-gold did I paraphrase that correctly? @katosh here are some pointers in case you want to play around with rendering a big JSON tree in jinja:
No, a jinja version of this would contain no (or almost no) string manipulation and just pass data into jinja. The idea would be an inversion of trust: marking things as safe would be explicit, wherever that doesn’t happen is automatically treated as unsafe and escaped.
Not really, ideally there wouldn’t be much Python left, the idea was that the AnnData object would be turned into a simple render-ready JSON-like data structure (
Yup! The idea would be that they’d override some jinja blocks in some of the templates, and do something like this (just an example how to conditionally add a sub-template, not necessarily the ideal structure):
{% extends "anndata.html" %}
{% block attribute %}
  {% if attr_name != "tem" %}
    {{ super() }}
  {% else %}
    {% include "tem.html" %}
  {% endif %}
{% endblock %}
Another option would be to add some special casing where people could register new attributes directly to replace boilerplate like the above, at the expense of added complexity (that would probably need a jinja filter or so?). |
I tend to think this is the way to go. Let's not bite off more than we need to here. I genuinely don't have a good grasp on what the use-case is here in strong terms - MuData already has its own renderer for example.
Right, and this could build off of the work in #2290 and extend the JSON schema there. I would also go for a less-feature complete but more robust version of a JSON schema. For example, I know that categories can get big, but I think we should not worry about that. That is a v2 feature. |
|
Thanks for the detailed proposal in your latest comments. I've mapped each feature onto the TypedDict + Jinja architecture to understand what transfers and what doesn't. To help navigate, here's where I address each of your points:
I've linked to relevant earlier comments throughout. Some points below build on arguments from earlier in the thread — I'd find it most productive if we can engage with those discussions rather than revisiting them from scratch. Before diving in: the instinct behind TypedDict + Jinja is architecturally sound in the general case — separating data from presentation, defaulting to auto-escaping, enabling JSON-serializable intermediates. If the rendering layer were the complex part of this system, I'd agree templates are the right tool. But in this system, the complexity lives in the Python formatting layer — type dispatch, error recovery, context-dependent decisions — which survives a Jinja migration unchanged. Jinja replaces the rendering layer, which is the simpler part. I want to walk through that concretely rather than assert it. The core tradeoff@flying-sheep outlined two options: (1) TypedDict + Jinja with extensibility built around it, or (2) leave out extensibility for now. @ilan-gold favors option 2:
I understand the core concerns here are maintainability and security — you'll be maintaining this code long-term and are responsible for ensuring it's safe. I share those goals. But as I'll show below, TypedDict + Jinja does not deliver the improvements in robustness and maintainability it appears to promise, and introduces a new maintenance cost at the Python/Jinja boundary. The key thing to surface is that this isn't just about deferring extensibility — TypedDict + Jinja is architecturally at odds with it. The features that require extensibility (ecosystem custom HTML, per-type dispatch) can't be expressed in a fixed TypedDict schema without falling back to So the real question is: do we want extensibility? But first, since the proposal seems to assume there's no structured intermediate representation, let me recap the architecture so we're working from the same mental model. How the current architecture worksThe PR doesn't go from object to HTML in one step. There are three layers:
We already have separation of concerns. The question is whether the rendering half should be written in Python or in Jinja — not whether separation exists. TypedDict + Jinja would replace layers 2 and 3, but the complexity doesn't live there. It lives in the formatters (~2,200 lines across What maps cleanly to TypedDict + JinjaThese features are fully compatible with a JSON-serializable intermediate representation — roughly 60-70% of the visual output:
What a Jinja migration must reimplementThe remaining features CAN be expressed as TypedDict fields — but the formatting logic that populates those fields requires Python features that can't move into Jinja templates. With TypedDict + Jinja, this logic must be reimplemented as a Python "crawl phase" that does the same work as the current formatters.
None of this logic goes away — it's reimplemented targeting TypedDict output instead of The maintenance cost at the boundaryIn addition, TypedDict + Jinja introduces a new maintenance cost that the current system doesn't have: a dual-contract boundary between the crawl phase and the template. In the current system, With TypedDict + Jinja, this changes. The TypedDict carries unresolved data — nullable fields for each attribute, raw category lists, error sentinels. The template must handle every combination with its own conditionals: {% if entry.shape is not none %}({{ entry.shape|join(', ') }}){% endif %}
{% if entry.dtype %} {{ entry.dtype }}{% endif %}
{% if entry.error %}<span class="warning">⚠ {{ entry.error }}</span>{% endif %}
{% if entry.colors %} {# render swatches #} {% endif %}That's two layers that must agree on: what fields are nullable, what null means, how partial results compose. A change to what the crawl phase produces can silently break the template, with no compile-time check across the Python/Jinja boundary. Mypy checks the TypedDict definition in Python; it cannot check that the template handles every nullable combination correctly. This is in tension with the typing rigor we've established elsewhere in this PR — the strict typing that motivated removing This compounds with JSON export. Adding a JSON consumer to the same TypedDict creates a third site that must handle the same combinatorial space of nullable fields — crawl, template, JSON serializer — all implementing their own conditional logic for the same partial-failure scenarios, all kept in sync manually.
Concretely, when
This is the opposite of reduced maintenance burden. The current system resolves ambiguity once, in the formatter. TypedDict + Jinja defers it to every consumer. What TypedDicts structurally preventUnlike the items above, these features are genuinely incompatible with a fixed TypedDict schema — not because of implementation effort, but because of structural limitations in how Jinja2 extensibility works.
These aren't features that could be added later on top of TypedDict + Jinja. The only escape hatch is Why extensibility matters@ilan-gold, you mentioned:
Here are the concrete cases: Discoverability of analysis results. Ecosystem tools store results across multiple AnnData slots, but there's no way for a user to see what was computed or which tool put it there — they just see generic arrays and columns. Our package kompot writes DE results to Reusable components for MuData and SpatialData. Early in this PR, @Zethson asked for exactly this:
The README rendering for collaborators. When sharing AnnData files between lab members, it's common to store a description in The extensibility API has been in this PR for months and is covered by 607 tests (108 adversarial). If there are specific maintainability or correctness concerns, I'd like to understand them so I can address them concretely. If you're concerned about API lock-in, Option B below keeps the API internal while preserving the architecture that makes it possible. On Jinja and securityI understand this is framed primarily as a security question, and I want to engage with that directly. I evaluated template-based architectures early on and explained this reasoning in detail. Let me revisit it in light of the specific proposal. The security argument for Jinja is: auto-escaping by default means a contributor can't accidentally forget to escape user data, preventing XSS from maliciously crafted AnnData files. That's a real concern, and I take it seriously. But let's be precise about the threat model and what Jinja actually changes. The threat is narrow. The attack surface is: an attacker crafts an AnnData file with malicious strings (e.g., Jinja's advantage is real but bounded. The failure mode asymmetry is genuine: forgetting The cost is disproportionate to the security gain. This improvement in default escaping for internal rendering comes at the cost of: a new dependency, a cross-language boundary with unchecked contracts (see Maintenance cost at the boundary above), and structural barriers to extensibility (see What TypedDicts structurally prevent above). And for ecosystem extensions that produce custom visualizations, the escaping responsibility moves to third-party code — Jinja provides no safety improvement there. Ecosystem extensions reintroduce the risk. If extensibility is supported, ecosystem packages would supply their own templates or generate HTML for custom visualizations. 
The escaping responsibility shifts to code outside anndata's control. But more fundamentally, ecosystem packages already run arbitrary Python in the user's process — a malicious or buggy package can execute code, access the filesystem, or exfiltrate data, none of which is constrained by HTML escaping. XSS in a formatter is a strictly lesser risk than what ecosystem code can already do. Jinja's auto-escaping on anndata's side doesn't change this threat model. CSS injection isn't addressed by Jinja either. Category color values from On robustness and scope
and
I want to address both the scope concern and the robustness expectation.
On review burden: The PR is large, and I understand that reviewing +22K lines is daunting. As I broke down earlier, 48% of those lines are tests, 15% is the visual test harness, and 8% is static assets (CSS/JS). The actual source code is ~29% (~6.4K lines). Dropping extensibility (
On robustness: The expectation of improved robustness from TypedDict + Jinja is misleading. The logic where robustness matters must be reimplemented in a crawl phase regardless (see above), and the boundary between crawl and template replaces a single-contract system with a dual-contract system, adding a maintenance surface, not removing one. I agree that dropping the extensibility API reduces scope; that's Option B below, and I'm happy to go that route. But the robustness question remains: is TypedDict + Jinja more robust than f-strings for the code that stays? The internal formatting logic (type dispatch,
For context on the rendering approach: xarray's repr uses f-strings and Jinja was never considered in that project's design discussion. Dask did migrate to Jinja (dask#8019), but for a different use case: Dask renders one known type per repr call (one template per type:
On JSON export
JSON export is a valuable goal and I'm in favor of it. But rather than motivating a Jinja migration, JSON export highlights the cost of TypedDict + Jinja. As discussed in Maintenance cost at the boundary, TypedDict + Jinja creates a dual-contract system where the crawl phase produces unresolved data and the template handles nullable combinations. Adding a JSON consumer to the same TypedDict creates a triple-contract system: three sites implementing conditional logic for the same partial-failure scenarios, kept in sync manually. With the current system, adding JSON export means adding a serialization method to
There's also a schema mismatch.
The HTML path truncates and summarizes: I raised several design questions in my earlier detailed response that I'd like to resolve before designing the schema:
I'd appreciate engagement on these questions — they need to be resolved regardless of which rendering architecture we choose. Path forwardI think there are three reasonable options for the rendering architecture, plus JSON export as a separate follow-up: Option A: Merge with extensibility API. The Option B: Merge without extensibility API. I remove Option C: Adopt TypedDict + Jinja, strip extensibility. Replace JSON export can be added as a follow-up to any option above, once the design questions above are resolved. My recommendation is A (or B as a compromise on review scope). I believe the current architecture provides the foundation for both extensibility and JSON export without the costs of a Jinja migration. I've created a visual side-by-side comparison (gist source) showing what each approach can express for the features discussed above — basic layout, category colors, error recovery, ecosystem custom HTML, and the maintenance cost of adding JSON export. I want to make sure we're making this decision on a shared understanding of the implementation. If there are specific parts of the code that feel hard to maintain or that raise security concerns, I'd welcome those pointers — they'd help me improve the implementation regardless of which direction we go. |

Rich HTML representation for AnnData
Summary
Implements rich HTML representation (_repr_html_) for AnnData objects in Jupyter notebooks. Builds on previous draft PRs (#784, #694, #521, #346) with a complete, production-ready implementation.
Live Demo | Reviewer's Guide (technical details, design decisions, extensibility examples)
Screenshot
Features
Interactive Display
.raw section showing unprocessed data (Report n_vars of .raw in __repr__ #349)
Visual Indicators
uns palettes (e.g., cell_type_colors)
uns values
uns["README"])
Serialization Warnings
Proactively warns about data that won't serialize:
/ (deprecated)
Compatibility
.anndata-repr prevents style conflicts
read_lazy() (categories, colors)
Extensibility
Three extension mechanisms for ecosystem packages (MuData, SpatialData, TreeData):
obst/vart, mod)
See the Reviewer's Guide for examples and API documentation.
Testing
python tests/visual_inspect_repr_html.py
Related
sparse_dataset by removing scipy inheritance #1927 (sparse scipy changes), feat: array-api compatibility #2063 (Array-API)
Acknowledgments
Thanks to @selmanozleyen (#784), @gtca (#694), @VolkerH (#521), @ivirshup (#346, #675), and @Zethson (#675) for prior work and discussions.
Technical Notes and Edits
Lazy Loading
Constants are in _repr_constants.py (outside _repr/) to prevent loading ~6K lines on import anndata. The full module loads only when _repr_html_() is called.
Config Changes
pyproject.toml: Added vart to codespell ignore list (TreeData section name).
Edit (Dec 27, 2024)
To simplify review and reduce the diff, I've merged settylab/anndata#3 into this PR. That PR was originally created as a follow-up to explore additional features based on the discussion with @Zethson about SpatialData/MuData extensibility.
What changed:
.raw section - Expandable row showing unprocessed data (Report n_vars of .raw in __repr__ #349)
Edit (Jan 4, 2025)
Moved detailed implementation documentation (architecture, design decisions, extensibility examples, configuration reference) to the Reviewer's Guide to keep this PR description focused on features.
Code refactoring:
html.py into focused modules for maintainability
components.py (badges, buttons, icons)
sections.py (obs/var, mapping, uns, raw)
core.py (avoids circular imports)
utils.py
FormatterContext consolidates all 6 rendering settings (read once at entry, propagated via context)
html.py reduced from ~2100 to ~740 lines, clean import hierarchy
New features:
read_lazy() AnnData objects (experimental) - indicates when obs/var are xarray-backed
(lazy) indicator on columns
Bug fixes:
adata-text-muted class for uniform appearance
Related issue discovered:
read_lazy() returns index values as byte-representation strings (e.g., "b'cell_0'" instead of "cell_0") - see ISSUE_READ_LAZY_INDEX.md
Edit (Jan 6, 2025)
Smart partial loading for read_lazy() AnnData:
Previously, lazy AnnData showed no category previews to avoid disk I/O. Now we do minimal, configurable loading to get richer visualization cheaply: only the first N category labels and their colors are read from storage (not the full column data). New setting repr_html_max_lazy_categories (default: 100, set to 0 for metadata-only mode).
Visual tests reorganized: 8 (Dask), 8b (lazy categories), 8c (metadata-only), 9 (backed).
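The partial-loading behavior can be sketched with a lazily evaluated iterable standing in for on-disk categories. `preview_categories` and `lazy_labels` are hypothetical names; the real implementation reads from storage rather than from a generator.

```python
from __future__ import annotations

from collections.abc import Iterable
from itertools import islice


def preview_categories(categories: Iterable[str], max_categories: int = 100) -> list[str]:
    """Read at most max_categories labels; 0 means metadata-only (read nothing)."""
    if max_categories <= 0:
        return []
    # islice pulls only the requested number of items from the source.
    return list(islice(categories, max_categories))


reads: list[int] = []


def lazy_labels(n: int):
    """Simulate storage access: record every label that is actually read."""
    for i in range(n):
        reads.append(i)
        yield f"cat_{i}"
```

With a limit of 3 on a 1000-category column, only three labels are pulled from the source, which is the point of the setting: the preview cost is bounded by the limit rather than by the column size.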
Edit (Jan 6, 2025 - continued)
FormattedOutput API and architecture:
Clean separation between formatters and renderers - formatters inspect data and produce complete FormattedOutput, renderers only receive FormattedOutput (never the original data).
The FormattedOutput dataclass fields were renamed to be self-documenting:
meta_content → preview (text) or preview_html (HTML)
html_content + is_expandable=True → expanded_html
html_content + is_expandable=False → preview_html
is_expandable → expanded_html is not None
type_html type_name visually)
Naming convention: *_html suffix indicates raw HTML (caller responsible for escaping), plain text fields are auto-escaped.
UI/UX improvements:
▼/▲ arrows instead of ⋯/▲ for consistency
Edit (Jan 7, 2025)
Test architecture overhaul:
Tests reorganized from a single file into 10 focused modules for maintainability and parallel execution:
test_repr_core.py
test_repr_sections.py
test_repr_formatters.py
test_repr_ui.py
test_repr_warnings.py
test_repr_registry.py
test_repr_lazy.py
test_html_validator.py
HTMLValidator class (conftest.py) provides structured HTML assertions:
Key features: regex-based (no dependencies), section-aware matching, exact attribute matching to avoid "obs" matching "obsm".
Optional strict validation when dependencies available:
validate_html5() - W3C HTML5 + ARIA (requires vnu)
validate_js() - JavaScript syntax (requires esprima)
Jupyter Notebook/Lab compatibility tests (13 new tests in TestJupyterNotebookCompatibility):
Validates CSS scoping, JavaScript isolation, unique IDs across multiple cells, and Jupyter dark mode support.
Bug fix:
readme-modal-title ID is now unique per container to prevent ID collisions when multiple AnnData objects are displayed in the same notebook.
Edit (Jan 8, 2025)
Maintainability improvements:
_render_entry_row and render_formatted_entry to eliminate duplication
get_formatter_for() and list_formatters() methods to FormatterRegistry
__init__.py
static/ directory
tests/repr/html_validator.py module (conftest.py: 960→270 lines)
_repr_constants.py
render_entry_type_cell() signature
lazy.py module
static/css_colors.txt for easy updates
File structure changes:
API simplifications:
render_entry_type_cell() now accepts TypeCellConfig dataclass instead of 10 individual parameters
is_lazy_adata(), is_lazy_column(), get_lazy_categories(), get_lazy_categorical_info()
importlib.resources.files() (Python 3.9+)
Edit (Jan 9, 2025)
Robustness & escaping coverage testing:
Added 108 tests in test_repr_robustness.py across 14 test classes:
html.escape() is called at every user-data insertion point using a <b>MARKER</b> probe
__repr__, __len__, __sizeof__, properties)
Escaping tests trust html.escape() (stdlib) and only verify it's called at every insertion point, rather than exercising the escaping mechanism itself with attack vectors.
Test cleanup:
Removed redundant and overly-specific tests to focus on meaningful coverage. Tests now verify behavior that matters (e.g., XSS escaped, errors visible, truncation applied) rather than testing identical code paths multiple times.
Visual inspection: Consolidated to 26 scenarios with single comprehensive "Evil AnnData" test combining all adversarial patterns.
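The probe-based escaping check described in this edit can be sketched as follows. `render_entry` is a hypothetical stand-in for a renderer with two user-data insertion points; the actual tests run against the real repr output.

```python
import html

PROBE = "<b>MARKER</b>"


def render_entry(name: str, preview: str) -> str:
    """Toy renderer: both user-controlled fields pass through html.escape()."""
    return (
        '<div class="entry">'
        f'<span class="name">{html.escape(name)}</span>'
        f'<span class="preview">{html.escape(preview)}</span>'
        "</div>"
    )


def assert_probe_escaped(rendered: str) -> None:
    """The raw probe must be absent and its escaped form present."""
    assert PROBE not in rendered
    assert html.escape(PROBE) in rendered
```

One test per insertion point places the probe in exactly that field, so a missing `html.escape()` call shows up as a raw `<b>` tag in the output.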
Fixes:
repr_html_max_readme_size to _settings.pyi type stubs
pytest.warns for expected warnings)
Updated stats:
Edit (Jan 16, 2025)
Error handling consolidation:
Refactored error handling to use a single error field in FormattedOutput instead of separate is_hard_error parameters scattered across the codebase.
Key changes:
FormattedOutput error: str | None field with documented precedence over preview/preview_html
FallbackFormatter
FormatterRegistry.format_value()
render_formatted_entry() is_hard_error param, now detects via output.error
_validate_key_and_collect_warnings() (key_warnings, is_key_not_serializable) - key issues mark as not serializable, preserving preview
Error vs Warning separation:
output.error: Hard rendering failure - row highlighted red, error message replaces preview
output.is_serializable=False: Serialization warning - red background, but preview preserved
New behavior when formatters fail:
This prevents long error messages from appearing in HTML while preserving full details in warnings for debugging. Serialization issues (like non-string keys, lambdas, custom objects) preserve the value preview while showing the reason in the tooltip.
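A simplified sketch of how the error field might take precedence over the preview fields. The field names follow the PR description, but the dataclass is reduced, and the `is_expandable` property, `render_preview` helper, and CSS class are assumptions for illustration.

```python
from __future__ import annotations

import html
from dataclasses import dataclass


@dataclass
class FormattedOutput:
    type_name: str
    preview: str | None = None        # plain text, escaped by the renderer
    preview_html: str | None = None   # raw HTML, caller already escaped it
    expanded_html: str | None = None
    error: str | None = None          # hard failure, replaces the preview
    is_serializable: bool = True      # warning only, preview is kept

    @property
    def is_expandable(self) -> bool:
        return self.expanded_html is not None


def render_preview(output: FormattedOutput) -> str:
    """Resolve the preview cell: error first, then raw HTML, then escaped text."""
    if output.error is not None:
        return f'<span class="error">{html.escape(output.error)}</span>'
    if output.preview_html is not None:
        return output.preview_html
    return html.escape(output.preview or "")
```

Keeping the serialization flag separate from `error` is what lets a non-serializable value still show its preview.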
Updated stats:
Edit (Jan 26, 2025)
Review response changes (addressing @flying-sheep's review):
Typing: `Any` → `object`

Replaced all ~95 uses of `Any` across 7 files. Formatter method signatures now use `obj: object` since AnnData's `uns` accepts genuinely arbitrary objects and formatters handle AnnData-like objects (e.g., MuData) via duck typing. `dict[str, Any]` with known structure replaced with precise union types.

CSS: Native nesting + dark mode + variable dedup

- `repr.css` to native CSS nesting (`&`). Selector repetitions of `.anndata-repr` reduced from 173 to 13. File length unchanged (~1164 lines) because the feature surface is genuinely large (~68 component blocks, 14 dtype colors, copy button, README styling, state variants), not because of repetition.
- `[data-theme="dark"]` for Furo/sphinx-book-theme) alongside existing Jupyter/VS Code detection.
- `@media (prefers-color-scheme: dark)` block and theme-selector block.
- `&--variant`) produce invalid CSS at nesting depth 2+ (browser treats `&` as `:is(parent child)`, so `&--view` becomes `:is(.anndata-repr .anndata-badge)--view`). 7 modifier rules flattened to sibling selectors.

Security tests simplified
Replaced ~34 attack-vector-heavy tests with 12 focused escaping-coverage tests. Each test puts a `<b>MARKER</b>` probe at one user-data insertion point and verifies it appears escaped. Removed `TestCSSAttacks`, `TestEncodingAttacks`; trimmed `TestBadColorArrays`, `TestEvilReadme`; consolidated `TestUltimateEvilAnnData` to 1 test. Total: 108 tests (14 classes), down from 123 (16 classes).

Other:
- `FormatterContext.column_name` renamed to `FormatterContext.key`
- `FormatterRegistry.format_value()`

Future-Proofing: Related PRs and Issues
This PR includes explicit handling and/or code references to track compatibility with several in-progress or future changes. The following PRs/issues may trigger updates to the `_repr` module:

Already Handled
`_repr`
- `SparseMatrixFormatter` uses duck typing fallback (`formatters.py:242,260,307`)
- `ArrayAPIFormatter` via duck typing (`formatters.py:771,1135`)
- `ArrayAPIFormatter`

May Require Updates When Merged

- `LazyCategoricalDtype` API; `CategoricalArray` internals; `lazy.py` (all functions)
- `obs`; `formatters.py:159`

Recommended Post-Merge Actions
When "feat: add `LazyCategoricalDtype` for lazy categorical columns" #2288 merges:
- `CategoricalFormatter` and `lazy.py` to use the new `LazyCategoricalDtype` API
- `get_lazy_categorical_info()` extracts category count by manually navigating `obj.variable._data.array`; replace with `dtype.n_categories` and `dtype.head_categories(n)`
- `isinstance(dtype, LazyCategoricalDtype)` for cleaner detection

When "Add support for lists in obs" #1923 is resolved:
- `_check_series_serializability()` in `formatters.py` to recognize list-of-strings as serializable

When "feat: allow gpu io in `sparse_dataset` by removing `scipy` inheritance" #1927 merges:
- `SparseMatrixFormatter` still works with new sparse array classes
- `is_sparse()` utility or the new classes have a stable API, the duck typing in `can_format()` (checking for `nnz`, `tocsr`, `tocsc`) could be simplified to direct type checks

When "feat: array-api compatibility" #2063 / "feat: support array-api" #2071 stabilize:
- `ArrayAPIFormatter` duck typing (`shape`/`dtype`/`ndim`) follows the Array API standard and is the correct approach
- `is_array_api_compatible()`, could use that instead of manual attribute checks
- `"cubed": "Cubed"` to `known_backends` dict in `ArrayAPIFormatter` for prettier display labels

Internal API Usage Inventory
Current patterns accessing internal/private APIs that may be replaceable:
- `lazy.py:_get_categorical_array()`: `col.variable._data.array` → `isinstance(dtype, LazyCategoricalDtype)`
- `lazy.py:get_lazy_category_count()`: `CategoricalArray._categories["values"].shape[0]` → `dtype.n_categories`
- `lazy.py:get_lazy_categorical_info()`: `._categories`, `._ordered` → `dtype.n_categories`, `dtype.ordered`
- `lazy.py:get_lazy_categories()`: `read_elem_partial()` on private `._categories` → `dtype.head_categories(n)`
- `lazy.py:is_lazy_adata()`: `obs.__class__.__name__ == "Dataset2D"`
- `SparseMatrixFormatter.can_format()`: `nnz`, `tocsr`, `tocsc`
- `ArrayAPIFormatter.can_format()`: `shape`, `dtype`, `ndim`
- `BackedSparseDatasetFormatter.can_format()`: `format` attr
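For reference, the attribute-based checks listed for the two `can_format()` methods can be sketched roughly as follows. The helper names `looks_sparse()` and `looks_array_like()` are hypothetical; only the attribute names come from the inventory above, and the real methods may apply additional checks (for example the `has_xp()` tier mentioned earlier in this thread).

```python
# Attribute names taken from the inventory; everything else is illustrative.
SPARSE_ATTRS = ("nnz", "tocsr", "tocsc")
ARRAY_ATTRS = ("shape", "dtype", "ndim")


def looks_sparse(obj: object) -> bool:
    """Duck-typed sparse check: no import of scipy or any other backend."""
    return all(hasattr(obj, name) for name in SPARSE_ATTRS)


def looks_array_like(obj: object) -> bool:
    """Duck-typed array check based on shape/dtype/ndim."""
    return all(hasattr(obj, name) for name in ARRAY_ATTRS)
```

Because these checks never import a backend, they keep working for classes that do not exist yet, at the cost of accepting any object that happens to expose the same attribute names.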