Replies: 6 comments 21 replies
Thanks for sharing a very well-thought-out architecture for a polyglot persistence layer for FHIR, Steve.
Thank you so much for putting together the document; it was a very interesting read! I have a few design questions and some minor editorial notes. Design Questions & Suggestions:
Minor notes:
I would like to initiate a discussion about SQL (Postgres) vs. S3 vs. hybrid storage models and where each would fit in the documented polyglot. A question beforehand: the document mentions S3/Object Storage in connection with Parquet. Am I understanding correctly that this mostly refers to analytical use, where the data is mapped to columnar models (probably following SQL-on-FHIR semantics, or the golden layer in the medallion architecture), and not so much to the primary data store? Some time ago we discussed an approach where one would use SQL for indexing and search while putting the full resource content in S3. While this brings its own challenges (single small-file writes to S3 are expensive), where would such an approach fit in the polyglot architecture?
Exploring unstructured data support
I took a closer look at the Feature Support Matrix + the hfs repo. Right now the matrix lists S3/Parquet as a read-only export backend, and the code seems to match that: the SQL-on-FHIR pipeline is focused on transforming FHIR resources into tabular outputs (CSV, JSON, NDJSON, Parquet), with cloud storage support for S3 / GCS / Azure Blob. That all looks geared toward bulk export + analytics, not using object storage as a primary persistence layer for large binaries. I also didn't see anything around storing/indexing "unstructured" payloads like DICOM imaging artifacts. Given how relevant imaging is becoming for clinical AI, I think it could be worth discussing what "unstructured data support" would look like in HFS. A few ideas to seed the conversation:
Is there already a plan for unstructured data support? Curious if this fits the roadmap, and what design patterns folks think make sense for a more "polyglot" FHIR persistence layer.
@smunini woah, this is a super ambitious project! I agree separating compute / storage is critical. As an example, I work primarily in DBX, which enables comprehensive audit info (SOC 2 folks are thrilled) in a way that doesn't affect query response time. DBX audit logs rely on browser resources, with separate compute for the query. Contrast this with SQL Server, where something like Spotlight can make the query experience dog slow. A few (possibly whacky) ideas this prompted:
I'm a Rust noob, but it was fun to read and learn a bit of the syntax.
I want to personally thank everyone who engaged with this discussion, @lschmierer, @dougc95, @sandhums, @dr00b. Your comments really helped me with my thinking about this important addition to the project. I'm happy to share that an initial PR implementing many of the ideas from this discussion is now available: #30. This will be included in an upcoming release I'm working on shortly. What's in the PR: The PR introduces the helios-persistence crate — a comprehensive FHIR persistence layer with polyglot storage support. Here's what it includes:
What's ahead: We'll be expanding database backend support over time. Beyond persistence, we're moving on to other key roadmap items, notably authentication and authorization, which will be a major focus in upcoming work. Look for a new discussion document to be posted here on GitHub soon on that topic. As always, feedback and contributions are welcome!
Introduction
As I write this in early 2026, I don't think it is an overstatement to say that the opportunities and impact that are upon us with AI in healthcare feel like a Cambrian Explosion moment. Healthcare professionals, administrators, and patients alike will be increasingly chatting with, talking directly to, and collaborating with artificial intelligence software systems in entirely new ways. This will need to be done safely and carefully.
What worked five years ago, or even two years ago, is increasingly inadequate for the demands of clinical AI, population health analytics, and real-time decision support. For technical architects navigating this shift, the challenge isn't just scaling storage; it's rethinking the entire data architecture.
This discussion document shares my thoughts about an approach to persistence for the Helios FHIR Server.
This document is an architecture strategy document. In other words, it describes the main motivating direction, building blocks, and key technology ingredients that will make up the persistence design for the Helios FHIR Server. It is not intended to be a comprehensive set of requirements and designs; instead, it contains enough of a starting point that readers can understand our approach to persistence, and understand why we made the decisions we did.
Who should read this?
The Helios FHIR Server is open source software, and is being developed in the open. If you have some interest in persistence design for healthcare software - this document is for you!
My hope is that you will think about the contents of this document, comment and provide feedback!
AI Is Driving New Requirements on Data
AI workloads have upended traditional assumptions about data access patterns. Training models demand sustained high-throughput reads across massive datasets, while inference requires low-latency access to distributed data sources. In healthcare, this is compounded by the explosive growth of unstructured data: radiology images, pathology slides, genomic sequences, clinical notes, and waveform data from monitoring devices, to name a few. Structured EHR data, once the center of gravity, is increasingly extracted from the EMR and compared with other external data sources. Architectures optimized for transactional workloads simply cannot deliver the performance AI pipelines require, and retrofitting them is often a losing battle.
Separation of Storage and Compute
Decoupling storage from compute has moved from a cloud-native best practice to an architectural necessity, yet many FHIR server implementations haven't caught up. While cloud-based analytics platforms routinely embrace this separation, transactional FHIR servers often remain tightly coupled to their persistence layers, treating database and application as an inseparable unit. This creates painful trade-offs: over-provisioning compute to get adequate storage, or vice versa. A modern FHIR server must separate these concerns as a core architectural principle, allowing the API layer to scale horizontally for request throughput while the persistence layer scales independently for capacity and query performance. In healthcare AI workloads, this separation is especially critical. Spin up GPU clusters for model training without provisioning redundant storage, or expand storage for imaging archives without paying for idle compute. The persistence layer becomes a service with its own scaling characteristics rather than a monolithic dependency. This separation is now expected as a defining characteristic of production-ready FHIR infrastructure.
Medallion Architecture Within FHIR Persistence
We have seen our largest petabyte-scale customers transition to a Medallion Architecture strategy for their FHIR data. The bronze layer represents resources as received, preserving original payloads, source system identifiers, and ingestion metadata for auditability and replay. The silver layer applies normalization: terminology mapping, reference resolution, deduplication of resources that represent the same clinical entity, and enforcement of business rules that go beyond FHIR validation. The gold layer materializes optimized views for specific consumers, denormalized patient summaries for clinical applications, flattened tabular projections for analytics, or pre-computed feature sets for ML pipelines.
Hybrid and Multi-Cloud Architectures
The reality for most health IT systems is a hybrid footprint: on-premises data centers housing legacy systems and sensitive workloads, cloud platforms providing elastic compute for AI and analytics, and edge infrastructure at clinical sites. Multi-cloud strategies add another dimension, whether driven by M&A activity, best-of-breed vendor selection, or risk diversification.
Security-First and Zero-Trust Patterns in FHIR Persistence
The persistence layer is where FHIR data lives at rest, making it the most critical surface for security enforcement. Zero-trust principles must be embedded in the persistence design itself, not just the API layer above it. This means encryption at rest as a baseline, but also fine-grained access control at the resource, compartment or even finer-grained levels - ensuring that database-level access cannot bypass FHIR authorization semantics. Audit logging must capture all persistence operations with sufficient detail for HIPAA accounting-of-disclosures requirements. This typically means persisting AuditEvent resources to a separately controlled store. Consent enforcement, particularly for sensitive resource types like mental health or substance abuse records under 42 CFR Part 2, often requires persistence-layer support through segmentation, tagging, or dynamic filtering. Treating security as an API-layer concern while leaving the persistence layer permissive creates unacceptable risk.
Data Retention, Tiering, and Cost Optimization
FHIR persistence layers accumulate data over years and decades. Version history, provenance records, and audit logs all create significant cost pressure. Intelligent tiering within the persistence layer moves older resource versions and infrequently accessed resources to lower-cost storage classes while keeping current data on performant storage. The architectural challenge is maintaining query semantics across tiers: a search that spans active and archived resources should work transparently, even if archived retrieval is slower. Retention policies must account for regulatory requirements that vary by resource type. Imaging studies may have different retention mandates than clinical notes. A well-designed persistence layer makes tiering a configuration concern rather than an architectural constraint.
Different Data Technologies for Different Problems
A FHIR persistence layer that commits to a single storage technology is making a bet that one tool can serve all masters. This is a bet that rarely pays off as requirements evolve. The reality is that different access patterns, query types, and workloads have fundamentally different performance characteristics, and no single database technology optimizes for all of them. A patient lookup by identifier, a population-level cohort query, a graph traversal of care team relationships, and a semantic similarity search for clinical trial matching across different terminology code systems are all legitimate operations against FHIR data, yet each performs best on a different underlying technology.
Relational Databases remain the workhorse for transactional FHIR operations, offering ACID guarantees, mature tooling, and well-understood query optimization for structured data with predictable access patterns.
NoSQL Databases - particularly document stores - align naturally with FHIR's resource model, persisting resources as complete documents without the impedance mismatch of relational decomposition, and scaling horizontally for high-throughput ingestion. Additionally, Cassandra has been exceptional at handling web-scale data requirements without breaking the bank.
Data Lakes provide cost-effective, schema-flexible storage for raw FHIR resources and bulk exports, serving as the foundation for large-scale analytics and ML training pipelines that need to process millions of resources.
Data Warehouses deliver optimized analytical query performance over structured, transformed FHIR data, enabling population health analytics, quality measure computation, and business intelligence workloads that would overwhelm transactional systems.
Graph Databases excel at traversing relationships. Patient to provider to organization to care team is an example of a relationship pathway that is represented as references in FHIR but is expensive to navigate through recursive joins in relational systems.
Vector Databases enable semantic search and similarity matching over embedded representations of clinical text, supporting AI use cases like similar-patient retrieval, terminology matching, and contextual search that go beyond keyword-based FHIR queries.
Block Storage provides the high-performance, low-latency foundation for database engines themselves, while also serving large binary attachments, imaging data, scanned documents, and waveforms that are referenced by FHIR resources but impractical to store within the resource payload.
The architectural discipline is not choosing one technology but designing the abstraction layer that routes FHIR operations to the appropriate backend while maintaining consistency, security, and a coherent developer experience.
Positioning the Helios FHIR Server in the FHIR Server Landscape
The FHIR server landscape can be understood along two architectural dimensions: how tightly the implementation is coupled to its storage technology, and whether the system supports multiple specialized data stores or requires a single backend.
The vertical axis distinguishes between servers with tightly-coupled persistence where the implementation is deeply intertwined with a specific database technology, and those offering an extensible interface layer that abstracts storage concerns behind well-defined interfaces. A FHIR Server built directly on JPA (Java Persistence API) is such an example, meaning its data access patterns, query capabilities, and performance characteristics are fundamentally shaped by relational database assumptions. In contrast, an extensible interface layer defines traits or interfaces that can be implemented for any storage technology, allowing the same FHIR API to sit atop different backends without rewriting core logic.
The horizontal axis captures the difference between single storage backend architectures and polyglot persistence. Polyglot persistence is an architectural pattern where different types of data are routed to the storage technologies best suited for how that data will be accessed. For example, a polyglot system might store clinical documents in an object store optimized for large binary content, maintain patient relationships in a graph database for efficient traversal, and keep structured observations in a columnar store for fast analytical queries all while presenting a unified FHIR API to consuming applications. Most existing FHIR servers force all resources into a single database, sacrificing performance and flexibility for implementation simplicity.
The Helios FHIR Server occupies the upper-right quadrant: it combines a trait-based, open-source interface layer built in Rust with native support for polyglot persistence. This architecture allows organizations to optimize storage decisions for their specific access patterns while maintaining full FHIR compliance at the API layer.
Decomposing the FHIR Specification: Separation of Concerns in Persistence Design
The FHIR specification is vast. It defines resource structures, REST interactions, search semantics, terminology operations, versioning behavior, and much more. A monolithic interface or trait that attempts to capture all of this becomes unwieldy, difficult to implement, and impossible to optimize for specific storage technologies. The Helios FHIR Server persistence design takes a different approach: decompose the specification into cohesive concerns, express each as a focused trait, and compose them to build complete storage backends.
Learning from Diesel: Type-Safe Database Abstractions
Before diving into our trait design, it's worth examining what we can learn from Diesel, Rust's most mature database abstraction layer. Diesel has solved many of the problems we face (multi-backend support, compile-time query validation, extensibility), and its design choices offer valuable lessons.
Backend Abstraction via Traits, Not Enums: Diesel defines a Backend trait that captures the differences between database systems (PostgreSQL, MySQL, SQLite) without coupling to specific implementations. The Backend trait specifies how SQL is generated, how bind parameters are collected, and how types are mapped. This allows new backends to be added without modifying core code. This is exactly what we need for polyglot FHIR persistence.
QueryFragment for Composable SQL Generation: Diesel's QueryFragment trait represents any piece of SQL that can be rendered. A WHERE clause, a JOIN, an entire SELECT statement all implement QueryFragment. This composability lets complex queries be built from simple pieces. For FHIR search, we can adopt a similar pattern: each search parameter modifier becomes a fragment that can be composed into complete queries.
Type-Level Query Validation: Diesel catches many errors at compile time by encoding schema information in the type system. While we can't achieve the same level of compile-time validation for dynamic FHIR queries, we can use Rust's type system to ensure that storage backends only claim to support operations they actually implement.
MultiConnection for Runtime Backend Selection: Diesel's #[derive(MultiConnection)] generates an enum that wraps multiple connection types, dispatching operations to the appropriate backend at runtime. This pattern directly applies to polyglot persistence. We can route FHIR operations to different backends based on query characteristics.
Extensibility via sql_function! and Custom Types: Diesel makes it trivial to add custom SQL functions and types. For FHIR, this translates to extensibility for custom search parameters, terminology operations, and backend-specific optimizations.
The Core Resource Storage Trait
At the foundation is the ResourceStorage trait, which handles the fundamental persistence of FHIR resources. This trait intentionally knows nothing about search, nothing about REST semantics, nothing about transactions. It simply stores and retrieves resources by type and identifier.
Multitenancy is not optional in this design. Every operation requires a TenantContext, making it impossible at the type level to accidentally execute a query without tenant scoping. There is no "escape hatch" that bypasses tenant isolation.
Notice what's absent: there's no if_match parameter for optimistic concurrency, no version-specific reads, no history. Those capabilities belong to separate traits that extend the base functionality. A storage backend that doesn't support versioning simply doesn't implement the versioning trait.
Multitenancy: A Cross-Cutting Concern
Multitenancy has downstream implications for every layer of a FHIR server, from indexing strategy to reference validation to search semantics. By requiring tenant context at the lowest storage layer, we ensure that isolation guarantees propagate upward through the entire system.
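As a minimal sketch of what "tenant context at the lowest storage layer" could look like, here is a toy in-memory backend keyed by tenant. The type and method names (TenantContext, ResourceStorage, InMemoryStore) are illustrative, not the actual Helios API, and a production trait would likely be async and fallible:

```rust
use std::collections::HashMap;

// Illustrative tenant context: every storage call must carry one,
// so tenant scoping cannot be forgotten at the call site.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
pub struct TenantContext {
    pub tenant_id: String,
}

// Hypothetical base storage trait: CRUD by type and id, nothing else.
pub trait ResourceStorage {
    fn put(&mut self, tenant: &TenantContext, resource_type: &str, id: &str, body: String);
    fn get(&self, tenant: &TenantContext, resource_type: &str, id: &str) -> Option<&String>;
}

// Toy in-memory backend keyed by (tenant, type, id) to show isolation.
#[derive(Default)]
pub struct InMemoryStore {
    data: HashMap<(String, String, String), String>,
}

impl ResourceStorage for InMemoryStore {
    fn put(&mut self, tenant: &TenantContext, resource_type: &str, id: &str, body: String) {
        self.data.insert(
            (tenant.tenant_id.clone(), resource_type.to_string(), id.to_string()),
            body,
        );
    }
    fn get(&self, tenant: &TenantContext, resource_type: &str, id: &str) -> Option<&String> {
        self.data
            .get(&(tenant.tenant_id.clone(), resource_type.to_string(), id.to_string()))
    }
}

fn main() {
    let mut store = InMemoryStore::default();
    let clinic_a = TenantContext { tenant_id: "clinic-a".into() };
    let clinic_b = TenantContext { tenant_id: "clinic-b".into() };
    store.put(&clinic_a, "Patient", "123", "{\"resourceType\":\"Patient\"}".into());
    // The same type/id under another tenant is invisible: isolation by construction.
    assert!(store.get(&clinic_a, "Patient", "123").is_some());
    assert!(store.get(&clinic_b, "Patient", "123").is_none());
}
```

Because every signature demands a TenantContext, there is simply no way to compile a call that reads or writes outside a tenant's scope.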
Isolation Strategies
There are three fundamental approaches to tenant isolation, each with different trade-offs:
Database-per-tenant: Strongest isolation, simplest security model, easier compliance story. The downside is operational overhead that grows linearly with tenants. Connection pool management becomes complex, and schema migrations are painful at scale.
Schema-per-tenant: Good isolation within a single database instance, allows tenant-specific indexing. PostgreSQL handles this well. Still has schema migration coordination challenges.
Shared schema with tenant discriminator: Most operationally efficient at scale, single migration path. The risk is that every query must include tenant filtering. One missed WHERE clause and you have a data breach.
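To make the shared-schema trade-off concrete, here is what the discriminator column and index layout might look like. This DDL is purely illustrative (table and column names are assumptions, not the actual Helios schema); the point is that tenant_id leads the primary key and every composite index, so per-tenant queries become cheap prefix scans and a missing tenant filter is immediately visible in review:

```rust
// Illustrative DDL for the shared-schema model: tenant_id is the leading
// column of the primary key and of every composite search index.
const CREATE_RESOURCES: &str = "\
CREATE TABLE resources (
    tenant_id      TEXT NOT NULL,
    resource_type  TEXT NOT NULL,
    id             TEXT NOT NULL,
    version_id     TEXT NOT NULL,
    content        JSONB NOT NULL,
    PRIMARY KEY (tenant_id, resource_type, id)
);";

const CREATE_TOKEN_INDEX: &str = "\
CREATE INDEX idx_search_token
    ON search_token (tenant_id, resource_type, param_name, system, code);";

fn main() {
    // Sanity check: every statement scopes by tenant first.
    for ddl in [CREATE_RESOURCES, CREATE_TOKEN_INDEX] {
        assert!(ddl.contains("tenant_id"));
    }
    println!("{}\n{}", CREATE_RESOURCES, CREATE_TOKEN_INDEX);
}
```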
For SQL-backed FHIR persistence, the shared schema approach with a tenant_id discriminator is pragmatic, but the enforcement layer must be airtight: you literally cannot construct a storage operation without providing tenant context.
Tenant Context as a Type-Level Guarantee
Borrowing from Diesel's approach to type safety, we can make tenant context explicit in the type system. Rather than passing tenant IDs as strings that might be forgotten, we create a wrapper type that must be present for any storage operation:
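A sketch of such a wrapper type, assuming a hypothetical TenantId newtype whose only constructor validates its input, so a raw string can never silently stand in for tenant scope:

```rust
// Hypothetical validated tenant identifier: the only way to obtain one is
// through `parse`, and storage APIs accept `&TenantId`, never `&str`.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct TenantId(String);

#[derive(Debug, PartialEq)]
pub enum TenantIdError {
    Empty,
    InvalidChar(char),
}

impl TenantId {
    pub fn parse(raw: &str) -> Result<Self, TenantIdError> {
        if raw.is_empty() {
            return Err(TenantIdError::Empty);
        }
        if let Some(c) = raw.chars().find(|c| !(c.is_ascii_alphanumeric() || *c == '-')) {
            return Err(TenantIdError::InvalidChar(c));
        }
        Ok(TenantId(raw.to_string()))
    }
    pub fn as_str(&self) -> &str {
        &self.0
    }
}

// Forgetting tenant scope becomes a compile error rather than a data leak:
// this function cannot be called without a validated TenantId in hand.
pub fn scoped_key(tenant: &TenantId, resource_type: &str, id: &str) -> String {
    format!("{}/{}/{}", tenant.as_str(), resource_type, id)
}

fn main() {
    let t = TenantId::parse("clinic-a").expect("valid tenant id");
    assert_eq!(scoped_key(&t, "Patient", "123"), "clinic-a/Patient/123");
    assert!(TenantId::parse("").is_err());
    assert!(TenantId::parse("bad tenant").is_err()); // space rejected
}
```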
Shared Resources and the System Tenant
CodeSystems, ValueSets, StructureDefinitions, and other conformance resources are typically shared across tenants. We designate a "system" tenant that holds these shared resources:
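One way to sketch this (illustrative names; the real design may differ): conformance resources live under a reserved system tenant, and lookups fall back to it when the requesting tenant has no private copy:

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct TenantId(String);

impl TenantId {
    pub const SYSTEM: &'static str = "system";
    pub fn new(id: &str) -> Self { TenantId(id.to_string()) }
    pub fn system() -> Self { TenantId(Self::SYSTEM.to_string()) }
}

// Toy conformance-resource lookup: try the requesting tenant first, then
// fall back to the shared "system" tenant that owns CodeSystems, ValueSets, etc.
fn resolve<'a>(
    store: &'a HashMap<(TenantId, String), String>,
    tenant: &TenantId,
    url: &str,
) -> Option<&'a String> {
    store
        .get(&(tenant.clone(), url.to_string()))
        .or_else(|| store.get(&(TenantId::system(), url.to_string())))
}

fn main() {
    let mut store = HashMap::new();
    store.insert(
        (TenantId::system(), "http://loinc.org".to_string()),
        "CodeSystem: LOINC".to_string(),
    );
    let clinic = TenantId::new("clinic-a");
    // clinic-a has no private copy, so the shared system copy is used.
    assert_eq!(resolve(&store, &clinic, "http://loinc.org").unwrap(), "CodeSystem: LOINC");
}
```

The fallback order also allows a tenant to shadow a shared artifact with its own customized version.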
Index Design for Multitenancy
Search performance in a multitenant system depends critically on index design. The tenant_id must be the leading column in composite indexes.
Versioning as a Separate Concern
FHIR's versioning model is sophisticated: every update creates a new version, version IDs are opaque strings, and the vread interaction retrieves historical versions. Not all storage backends can efficiently support this. An append-only data lake handles versioning naturally; a key-value store might not.
History: Building on Versioning
History access naturally extends versioning. If a backend can read specific versions, it can also enumerate them:
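A sketch of the layering, with a toy append-only backend to make it concrete. Trait and method names are illustrative (the real Helios traits would likely be async and return richer error types):

```rust
use std::collections::HashMap;

pub struct TenantContext { pub tenant_id: String }

// Illustrative base trait: current-version reads only.
pub trait ResourceStorage {
    fn read(&self, t: &TenantContext, rtype: &str, id: &str) -> Option<String>;
}

// vread extends basic reads with access to specific historical versions.
pub trait VersionedStorage: ResourceStorage {
    fn vread(&self, t: &TenantContext, rtype: &str, id: &str, version_id: &str) -> Option<String>;
}

// History enumerates version ids (newest first); vread then fetches each one.
pub trait HistoryProvider: VersionedStorage {
    fn history(&self, t: &TenantContext, rtype: &str, id: &str) -> Vec<String>;
}

// Toy append-only backend: every update pushes a new (version_id, body) pair.
#[derive(Default)]
pub struct AppendOnlyStore {
    versions: HashMap<(String, String, String), Vec<(String, String)>>,
}

impl AppendOnlyStore {
    fn key(t: &TenantContext, rtype: &str, id: &str) -> (String, String, String) {
        (t.tenant_id.clone(), rtype.to_string(), id.to_string())
    }
    pub fn update(&mut self, t: &TenantContext, rtype: &str, id: &str, body: String) {
        let entry = self.versions.entry(Self::key(t, rtype, id)).or_default();
        let vid = format!("{}", entry.len() + 1);
        entry.push((vid, body));
    }
}

impl ResourceStorage for AppendOnlyStore {
    fn read(&self, t: &TenantContext, rtype: &str, id: &str) -> Option<String> {
        self.versions.get(&Self::key(t, rtype, id))?.last().map(|(_, b)| b.clone())
    }
}

impl VersionedStorage for AppendOnlyStore {
    fn vread(&self, t: &TenantContext, rtype: &str, id: &str, version_id: &str) -> Option<String> {
        self.versions.get(&Self::key(t, rtype, id))?
            .iter().find(|(v, _)| v == version_id).map(|(_, b)| b.clone())
    }
}

impl HistoryProvider for AppendOnlyStore {
    fn history(&self, t: &TenantContext, rtype: &str, id: &str) -> Vec<String> {
        self.versions.get(&Self::key(t, rtype, id))
            .map(|vs| vs.iter().rev().map(|(v, _)| v.clone()).collect())
            .unwrap_or_default()
    }
}

fn main() {
    let mut s = AppendOnlyStore::default();
    let t = TenantContext { tenant_id: "clinic-a".into() };
    s.update(&t, "Patient", "123", "v1 body".into());
    s.update(&t, "Patient", "123", "v2 body".into());
    assert_eq!(s.read(&t, "Patient", "123").unwrap(), "v2 body");
    assert_eq!(s.vread(&t, "Patient", "123", "1").unwrap(), "v1 body");
    assert_eq!(s.history(&t, "Patient", "123"), vec!["2".to_string(), "1".to_string()]);
}
```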
The trait hierarchy HistoryProvider: VersionedStorage: ResourceStorage means that any storage backend supporting history automatically supports versioned reads and basic CRUD - all within tenant boundaries. The type system enforces this relationship.
The Search Abstraction: Decomposing FHIR's Query Model
Search is where the FHIR specification becomes genuinely complex. There are eight search parameter types (number, date, string, token, reference, quantity, uri, composite), sixteen modifiers (:exact, :contains, :not, :missing, :above, :below, :in, :not-in, :of-type, :identifier, :text, :code-text, :text-advanced, :iterate, plus resource type modifiers on references), nine comparison prefixes (eq, ne, lt, le, gt, ge, sa, eb, ap), chained parameters, reverse chaining (_has), _include and _revinclude directives, and advanced filtering via _filter. A single search query can combine all of these, all while respecting tenant boundaries.
Modeling search as a single trait would be a mistake. Instead, we decompose it into layers - and here, Diesel's QueryFragment pattern proves invaluable.
The SearchFragment Pattern (Inspired by Diesel's QueryFragment)
Diesel's QueryFragment trait allows any piece of SQL to be composable. We adapt this pattern for FHIR search, creating fragments that can be combined into complete search queries. Each search modifier becomes a fragment that knows how to render itself:
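The following sketch shows the idea with hypothetical names (SearchFragment, QueryBuffer): each piece of a query renders itself into a backend-neutral buffer, and composition (here, AND) is itself just another fragment:

```rust
// Backend-neutral render target: generated query text plus bind values.
pub struct QueryBuffer {
    pub sql: String,
    pub binds: Vec<String>,
}

// Diesel-QueryFragment-inspired trait: anything renderable into a query.
pub trait SearchFragment {
    fn render(&self, out: &mut QueryBuffer);
}

// A string parameter with the :exact modifier, rendered as one fragment.
pub struct ExactMatch { pub column: &'static str, pub value: String }

impl SearchFragment for ExactMatch {
    fn render(&self, out: &mut QueryBuffer) {
        out.sql.push_str(&format!("{} = ?", self.column));
        out.binds.push(self.value.clone());
    }
}

// Fragments compose: AND-ing a list of fragments is itself a fragment.
pub struct And(pub Vec<Box<dyn SearchFragment>>);

impl SearchFragment for And {
    fn render(&self, out: &mut QueryBuffer) {
        for (i, f) in self.0.iter().enumerate() {
            if i > 0 { out.sql.push_str(" AND "); }
            f.render(out);
        }
    }
}

fn main() {
    let query = And(vec![
        Box::new(ExactMatch { column: "family_name", value: "Smith".into() }),
        Box::new(ExactMatch { column: "gender", value: "female".into() }),
    ]);
    let mut buf = QueryBuffer { sql: String::new(), binds: vec![] };
    query.render(&mut buf);
    assert_eq!(buf.sql, "family_name = ? AND gender = ?");
    assert_eq!(buf.binds, vec!["Smith".to_string(), "female".to_string()]);
}
```

A non-SQL backend would implement the same trait against its own query representation; the composition logic does not change.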
Search Parameter Types
First, we model the search parameter types and their associated matching logic:
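A minimal sketch of that modeling (the enum mirrors the eight types from the specification; the helper predicate is an illustrative example of per-type logic, not the actual Helios API):

```rust
// The eight FHIR search parameter types.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SearchParamType {
    Number,
    Date,
    String,
    Token,
    Reference,
    Quantity,
    Uri,
    Composite,
}

// Illustrative per-type logic: the ordered comparison prefixes
// (eq, ne, lt, le, gt, ge, sa, eb, ap) apply to ordered value spaces.
pub fn supports_prefixes(t: SearchParamType) -> bool {
    matches!(
        t,
        SearchParamType::Number | SearchParamType::Date | SearchParamType::Quantity
    )
}

fn main() {
    assert!(supports_prefixes(SearchParamType::Date));
    assert!(!supports_prefixes(SearchParamType::Token));
}
```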
The Core Search Trait
The base search trait handles fundamental query execution without advanced features:
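A sketch of what that base trait might look like (names and signatures are assumptions): flat parameter matching with paging, and nothing else - chaining, _include, terminology, and full-text live in extension traits:

```rust
pub struct TenantContext { pub tenant_id: String }

// Hypothetical core query: a flat list of (param, value) pairs with paging.
pub struct SearchQuery {
    pub resource_type: String,
    pub params: Vec<(String, String)>, // e.g. ("family", "Smith")
    pub count: usize,
    pub offset: usize,
}

pub struct SearchResult {
    pub ids: Vec<String>,
    // Counting can be expensive; backends may decline to report a total.
    pub total: Option<u64>,
}

pub trait SearchProvider {
    fn search(&self, tenant: &TenantContext, query: &SearchQuery) -> SearchResult;
}

// Minimal stub backend to make the sketch concrete.
struct NullBackend;

impl SearchProvider for NullBackend {
    fn search(&self, _t: &TenantContext, _q: &SearchQuery) -> SearchResult {
        SearchResult { ids: vec![], total: Some(0) }
    }
}

fn main() {
    let q = SearchQuery {
        resource_type: "Patient".into(),
        params: vec![("family".into(), "Smith".into())],
        count: 20,
        offset: 0,
    };
    let r = NullBackend.search(&TenantContext { tenant_id: "t1".into() }, &q);
    assert!(r.ids.is_empty());
    assert_eq!(r.total, Some(0));
}
```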
Advanced Search Capabilities as Extension Traits
Not every storage backend can support every search feature. A relational database might handle token searches efficiently but struggle with subsumption queries that require terminology reasoning. A vector database might excel at text search but lack native support for date range queries. We model these variations as extension traits.
Chained Search Provider:
Terminology Search Provider:
Text Search Provider:
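The three extension traits named above can be sketched as follows. Signatures are illustrative: the point is that a backend implements only the traits it can honestly support, as the stub document store at the bottom demonstrates:

```rust
pub struct TenantContext { pub tenant_id: String }

// Base search trait (sketch) that the extensions below build on.
pub trait SearchProvider {
    fn search_ids(&self, t: &TenantContext, rtype: &str, param: &str, value: &str) -> Vec<String>;
}

// Chained parameters, e.g. Observation?subject.name=Smith
pub trait ChainedSearchProvider: SearchProvider {
    fn search_chained(&self, t: &TenantContext, rtype: &str, chain: &[&str], value: &str) -> Vec<String>;
}

// Terminology-aware matching, e.g. code:below=<system>|<code> (subsumption)
pub trait TerminologySearchProvider: SearchProvider {
    fn search_subsumed(&self, t: &TenantContext, rtype: &str, system: &str, code: &str) -> Vec<String>;
}

// Free-text search over narrative and notes, e.g. _text=chest pain
pub trait TextSearchProvider: SearchProvider {
    fn search_text(&self, t: &TenantContext, rtype: &str, text: &str) -> Vec<String>;
}

// This stub document store supports plain and text search but not chaining
// or subsumption, so it simply does not implement those traits.
struct DocStore;

impl SearchProvider for DocStore {
    fn search_ids(&self, _t: &TenantContext, _r: &str, _p: &str, _v: &str) -> Vec<String> {
        vec!["patient-1".into()]
    }
}

impl TextSearchProvider for DocStore {
    fn search_text(&self, _t: &TenantContext, _r: &str, _text: &str) -> Vec<String> {
        vec!["patient-1".into()]
    }
}

fn main() {
    let t = TenantContext { tenant_id: "clinic-a".into() };
    let s = DocStore;
    assert_eq!(s.search_ids(&t, "Patient", "family", "Smith"), vec!["patient-1".to_string()]);
    assert_eq!(s.search_text(&t, "Patient", "chest pain"), vec!["patient-1".to_string()]);
}
```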
This decomposition has practical consequences. When configuring a polyglot persistence layer, we can route terminology-aware searches to a backend that integrates with a terminology server (perhaps backed by a graph database), while directing simple token matches to a faster document store. The trait system makes these routing decisions explicit and type-safe.
Transactions: When Atomicity Matters
FHIR defines batch and transaction bundles. A batch processes entries independently; a transaction either succeeds completely or fails entirely with no partial effects. This all-or-nothing semantics requires database-level transaction support - something not all storage technologies provide natively.
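The distinction can be sketched like this (toy types; the real traits would operate on bundle entries, not strings): batch needs only the base trait and tolerates partial failure, while transaction semantics require a dedicated trait with all-or-nothing behavior:

```rust
// Hypothetical base trait: apply one operation, independently.
pub trait ResourceStorage {
    fn apply(&mut self, op: &str) -> Result<(), String>;
    fn applied(&self) -> usize;
}

// Transaction-capable backends additionally implement atomic application.
pub trait TransactionProvider: ResourceStorage {
    fn transaction(&mut self, ops: &[&str]) -> Result<(), String>;
}

// Batch semantics against any backend: each entry is independent.
pub fn batch(store: &mut dyn ResourceStorage, ops: &[&str]) -> Vec<Result<(), String>> {
    ops.iter().map(|op| store.apply(op)).collect()
}

// Toy backend: an op equal to "bad" fails validation.
#[derive(Default)]
struct MemStore { committed: Vec<String> }

impl ResourceStorage for MemStore {
    fn apply(&mut self, op: &str) -> Result<(), String> {
        if op == "bad" { return Err("validation failed".into()); }
        self.committed.push(op.to_string());
        Ok(())
    }
    fn applied(&self) -> usize { self.committed.len() }
}

impl TransactionProvider for MemStore {
    fn transaction(&mut self, ops: &[&str]) -> Result<(), String> {
        // Validate everything first; commit only if every entry is valid.
        if ops.contains(&"bad") {
            return Err("transaction aborted: invalid entry".into());
        }
        for op in ops { self.committed.push(op.to_string()); }
        Ok(())
    }
}

fn main() {
    let mut s = MemStore::default();
    // Transaction with a failing entry leaves no partial effects.
    assert!(s.transaction(&["create Patient/1", "bad"]).is_err());
    assert_eq!(s.applied(), 0);
    // Batch with the same entries applies the good one and reports the bad one.
    let outcomes = batch(&mut s, &["create Patient/1", "bad"]);
    assert_eq!(s.applied(), 1);
    assert!(outcomes[1].is_err());
}
```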
A storage backend that doesn't support transactions can still handle batch operations. It simply processes each entry independently, accepting that failures may leave partial results. The trait separation makes this distinction clear: code that requires atomicity takes &dyn TransactionProvider, while code that can tolerate partial failures takes &dyn ResourceStorage.
Audit Events: A Separated Persistence Store
AuditEvent resources should ideally be stored separately from clinical data. This isn't just a security concern; it's also an architectural one. Audit logs have different access patterns (append-heavy, rarely queried except during investigations), different retention requirements (often longer than clinical data), and different security constraints (must be tamper-evident, may require separate access controls).
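A sketch of what a dedicated audit trait could look like (record fields and method names are illustrative): the interface is deliberately append-only, with a narrow query surface for investigations:

```rust
// Illustrative audit record: who did what to which resource, and when.
pub struct AuditRecord {
    pub tenant_id: String,
    pub action: String,       // e.g. "read", "update", "delete"
    pub resource_ref: String, // e.g. "Patient/123"
    pub recorded_at: u64,     // epoch seconds
}

// Hypothetical audit trait, separate from ResourceStorage by design.
pub trait AuditStorage {
    // Intentionally no update or delete: the log is append-only.
    fn append(&mut self, event: AuditRecord);
    // Narrow query surface for accounting-of-disclosures investigations.
    fn for_resource(&self, tenant_id: &str, resource_ref: &str) -> Vec<&AuditRecord>;
}

#[derive(Default)]
struct AuditLog { events: Vec<AuditRecord> }

impl AuditStorage for AuditLog {
    fn append(&mut self, event: AuditRecord) { self.events.push(event); }
    fn for_resource(&self, tenant_id: &str, resource_ref: &str) -> Vec<&AuditRecord> {
        self.events.iter()
            .filter(|e| e.tenant_id == tenant_id && e.resource_ref == resource_ref)
            .collect()
    }
}

fn main() {
    let mut log = AuditLog::default();
    log.append(AuditRecord {
        tenant_id: "clinic-a".into(),
        action: "read".into(),
        resource_ref: "Patient/123".into(),
        recorded_at: 1_700_000_000,
    });
    assert_eq!(log.for_resource("clinic-a", "Patient/123").len(), 1);
    assert!(log.for_resource("clinic-b", "Patient/123").is_empty());
}
```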
The separation of AuditStorage from ResourceStorage enables critical architectural flexibility. Audit events can flow to a dedicated time-series database optimized for append-only writes, or to an immutable ledger for tamper evidence, or to a separate cloud account for security isolation.
The REST Layer: Mapping HTTP to Storage
The FHIR REST API defines interactions (read, vread, update, create, delete, search, etc.) that map HTTP verbs and URL patterns to operations. This mapping is a separate concern from storage. The same storage backend might be accessed via REST, GraphQL, messaging, or bulk export.
The RestHandler is a coordination layer that combines multiple storage traits to implement FHIR REST semantics. A read interaction needs only ResourceStorage. A vread needs VersionedStorage. A search with _include needs both SearchProvider and ResourceStorage. The REST handler composes these capabilities based on what the request requires and what the storage backend provides.
Capability Statements: Documenting What Storage Supports
The FHIR specification requires servers to publish a CapabilityStatement declaring which interactions, resources, and search parameters they support. When storage backends have different capabilities, this statement must accurately reflect the union of what's available and identify gaps.
Diesel solves a similar problem with its type system. Operations that aren't supported simply don't compile. For FHIR, we need runtime capability discovery because queries are dynamic. We model storage capabilities as a queryable trait that can generate CapabilityStatement fragments:
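A sketch of such a trait (names and the generated fragment are illustrative): the backend answers capability questions, and the interaction list in the CapabilityStatement is derived from those answers rather than hand-maintained:

```rust
// Hypothetical capability trait each backend implements.
pub trait StorageCapabilities {
    fn supports_versioning(&self) -> bool;
    fn supports_transactions(&self) -> bool;
    fn supported_modifiers(&self, param_type: &str) -> Vec<&'static str>;
}

struct PostgresBackend;

impl StorageCapabilities for PostgresBackend {
    fn supports_versioning(&self) -> bool { true }
    fn supports_transactions(&self) -> bool { true }
    fn supported_modifiers(&self, param_type: &str) -> Vec<&'static str> {
        match param_type {
            "string" => vec![":exact", ":contains"],
            "token" => vec![":not", ":of-type"],
            _ => vec![],
        }
    }
}

// Derive the interaction list for a CapabilityStatement REST fragment
// from the answers, so the statement can never drift from reality.
fn interactions(c: &dyn StorageCapabilities) -> Vec<&'static str> {
    let mut out = vec!["read", "create", "update", "search-type"];
    if c.supports_versioning() {
        out.push("vread");
        out.push("history-instance");
    }
    if c.supports_transactions() {
        out.push("transaction");
    }
    out
}

fn main() {
    let caps = interactions(&PostgresBackend);
    assert!(caps.contains(&"transaction"));
    assert!(caps.contains(&"vread"));
    assert_eq!(PostgresBackend.supported_modifiers("string"), vec![":exact", ":contains"]);
}
```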
Dynamic Capability Checking
For operations that can't be checked at compile time, we provide runtime capability checking that fails fast with clear error messages:
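For example (hypothetical API), a guard that checks a backend's capability flags before dispatch and returns a precise, OperationOutcome-style error instead of a cryptic backend failure:

```rust
// Illustrative error: names the missing feature and the offending backend,
// so operators see immediately why a query was rejected.
#[derive(Debug, PartialEq)]
pub enum CapabilityError {
    Unsupported { feature: &'static str, backend: &'static str },
}

pub struct Capabilities {
    pub chained_search: bool,
    pub backend: &'static str,
}

// Fail fast: check before dispatching, not after a backend error surfaces.
pub fn require_chained(caps: &Capabilities) -> Result<(), CapabilityError> {
    if caps.chained_search {
        Ok(())
    } else {
        Err(CapabilityError::Unsupported {
            feature: "chained search",
            backend: caps.backend,
        })
    }
}

fn main() {
    let kv = Capabilities { chained_search: false, backend: "keyvalue" };
    let err = require_chained(&kv).unwrap_err();
    assert_eq!(
        err,
        CapabilityError::Unsupported { feature: "chained search", backend: "keyvalue" }
    );

    let pg = Capabilities { chained_search: true, backend: "postgres" };
    assert!(require_chained(&pg).is_ok());
}
```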
The Feature Support Matrix
Different storage technologies have different strengths. A key deliverable of the Helios FHIR Server's persistence design is a clear feature support matrix that documents what each storage backend provides. This (example, work-in-progress) matrix drives the CapabilityStatement generation and helps operators choose the right backend for their workload.
This matrix isn't static. It's generated from the StorageCapabilities implementations. When a new storage backend is added or an existing one gains features, the matrix updates automatically.
Composing Storage Backends (Inspired by Diesel's MultiConnection)
Diesel's MultiConnection derive macro generates an enum that wraps multiple connection types, dispatching to the appropriate backend at runtime. We adapt this pattern for polyglot FHIR persistence, but with intelligent routing based on query characteristics. The routing logic becomes explicit policy that considers both capabilities and cost:
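A sketch of that dispatch-plus-policy idea (backend names, query traits, and thresholds are all illustrative assumptions, not the actual Helios routing rules):

```rust
// Wrapper enum in the spirit of Diesel's MultiConnection: one value that
// stands for whichever backend a query should run against.
#[derive(Debug, PartialEq)]
pub enum Backend {
    DocumentStore,
    Relational,
    GraphDb,
}

// Characteristics extracted from a parsed query that drive routing.
pub struct QueryTraits {
    pub traverses_references: bool, // e.g. patient -> provider -> care team walks
    pub is_analytical: bool,        // e.g. population-level aggregation
    pub estimated_rows: u64,
}

// Explicit policy, capabilities first and cost second: graph walks go to
// the graph database, large analytical scans to the relational backend,
// and cheap point reads/writes to the document store.
pub fn route(q: &QueryTraits) -> Backend {
    if q.traverses_references {
        Backend::GraphDb
    } else if q.is_analytical || q.estimated_rows > 100_000 {
        Backend::Relational
    } else {
        Backend::DocumentStore
    }
}

fn main() {
    let graph_walk = QueryTraits { traverses_references: true, is_analytical: false, estimated_rows: 10 };
    assert_eq!(route(&graph_walk), Backend::GraphDb);

    let point_read = QueryTraits { traverses_references: false, is_analytical: false, estimated_rows: 1 };
    assert_eq!(route(&point_read), Backend::DocumentStore);

    let cohort = QueryTraits { traverses_references: false, is_analytical: true, estimated_rows: 2_000_000 };
    assert_eq!(route(&cohort), Backend::Relational);
}
```

Because the policy is a plain function over query traits, it can be unit-tested and evolved independently of both the REST layer and the backends.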
The Path Forward
This trait-based decomposition provides a foundation for building a FHIR persistence layer that can evolve with requirements. When AI workloads demand vector similarity search, we add a VectorSearchProvider trait and plug in a vector database. When regulatory requirements demand immutable audit trails, we implement AuditStorage against an append-only ledger. When performance analysis reveals that graph traversals are bottlenecking population health queries, we route those operations to a dedicated graph database.
Extensibility Following Diesel's Model: Just as Diesel's sql_function! macro makes it trivial to add custom SQL functions, our design should make it easy to add custom search parameters and modifiers. A healthcare organization might need a custom :phonetic modifier for patient name matching, or a :geo-near modifier for location-based searches. The SearchFragment pattern enables this.
This is what it means to build FHIR persistence for the AI era: not a monolithic database adapter, but a composable system of specialized capabilities that can be assembled to meet the specific needs of each deployment with tenant isolation, search routing, and extensibility built into the architecture from the start.
Thank you!
I very much look forward to your thoughts on these ideas and to the discussions that follow.
Sincerely,
-Steve