From b84d280ec5f8aa596813eceb34781e36c796ac03 Mon Sep 17 00:00:00 2001 From: Tomasz Wejroch Date: Wed, 17 Dec 2025 18:27:18 +0100 Subject: [PATCH 1/2] Update docs for release --- README.md | 332 +++++++++++----------------- dbzero/dbzero/atomic.py | 2 +- dbzero/dbzero/compare.py | 6 +- dbzero/dbzero/dbzero.pyi | 109 ++++----- dbzero/dbzero/fast_query.py | 6 +- dbzero/dbzero/storage_api.py | 4 +- src/dbzero/core/utils/hash_func.cpp | 1 + 7 files changed, 190 insertions(+), 270 deletions(-) diff --git a/README.md b/README.md index 84d82d92..a74fdd76 100644 --- a/README.md +++ b/README.md @@ -4,43 +4,35 @@ [![License: AGPL v3](https://img.shields.io/badge/License-AGPL%20v3-blue.svg)](https://www.gnu.org/licenses/agpl-3.0) ---- - ## Overview **dbzero** lets you code as if you have infinite memory. Inspired by a thought experiment from *Architecture Patterns with Python* by Harry Percival and Bob Gregory, dbzero handles the complexities of data management in the background while you work with simple Python objects. dbzero implements the **DISTIC memory** model: -- **D**urable - Data persists automatically across application restarts +- **D**urable - Data persists across application restarts - **I**nfinite - Work with data as if memory constraints don't exist - **S**hared - Multiple processes can access and share the same data - **T**ransactions - Transaction support for data integrity - **I**solated - Operations are isolated and thread-safe - **C**omposable - A single process can integrate multiple memory partitions (prefixes) to suit its specific requirements -The result is a simplified application stack that eliminate the need for separate databases, ORMs and caching layers. This reduces architectural complexity and development time, while offering significant performance benefits in some cases, due to reduced serialization overhead and cache locality. 
+The result is a simplified application stack that eliminates the need for separate databases, ORMs and caching layers. This reduces architectural complexity and development time, while offering significant performance benefits, due to reduced serialization overhead and cache locality. ---- +## Key Platform Features -## Key Features +**dbzero** provides the reliability of a traditional database system with modern capabilities and extra features on top. -### Database Capabilities -- **ACID Transactions** - Serializable consistency guarantees for all data operations -- **Single-writer, Multiple-reader** concurrency model -- **Automatic Persistence** - Objects are automatically saved to disk -- **Efficient Caching** - Only accessed data is loaded into memory -- **Time Travel** - Query historical states at any transaction point -- **Data Partitioning** - Build distributed architectures using prefixes and distributed transactions -- **Custom Data Models** - Unlike traditional databases, dbzero allows you to define custom data structures that match your domain's needs - -### Developer Experience -- **Invisible by Design** - Minimal API surface; write regular Python code -- **Dynamic Schema** - No separate data models or schema definitions needed -- **AI-Friendly** - Works exceptionally well with AI coding agents -- **Type Support** - Full support for Python's built-in types and collections -- **Reference Counting** - Automatic garbage collection of unused objects - ---- +- **Persistence**: Application objects (classes and common structures like `list`, `dict`, `set`, etc.) are automatically persisted to the underlying storage medium (e.g. a local file). +- **Efficient caching**: Only the data actually accessed is retrieved and cached. For example, if a list has 1 million elements but only 10 are accessed, only those 10 are loaded. +- **Constrained memory usage**: You can define memory limits for the process to control RAM consumption. 
+- **Serializable consistency**: Data changes are immediately visible to readers, maintaining a consistent view. +- **Transactions**: Make atomic changes using the `with dbzero.atomic():` context manager. +- **Snapshots & Time Travel**: Query data as it existed at a specific point in the past. This enables tracking of data changes and simplifies auditing. +- **Tags**: Tag objects and use tags to filter or retrieve data. +- **Indexing**: Define imperative indexes that can be dynamically created and updated. +- **Data composability**: Combine data from different apps, processes, or servers and access it through a unified interface. +- **UUID support**: All objects are automatically assigned a universally unique identifier, so they can always be referenced directly. +- **Custom data models**: Unlike traditional databases, dbzero allows you to define custom data structures to match your domain's needs. ## Requirements @@ -49,8 +41,6 @@ The result is a simplified application stack that eliminate the need for separat - **Storage**: Local filesystem or network-attached storage - **Memory**: Varies by workload; active working set should fit in RAM for best performance ---- - ## Quick Start ### Installation @@ -59,43 +49,40 @@ The result is a simplified application stack that eliminate the need for separat pip install dbzero ``` -### Hello World Example +### Simple Example + +The guiding philosophy behind **dbzero** is *invisibility*: it stays out of your way as much as possible. In most cases, unless you're using advanced features, you won’t even notice it’s there. No schema definitions, no explicit save calls, no ORM configuration. You just write regular Python code, as you always have. 
See the complete working example below: ```python -from dataclasses import dataclass import dbzero as db0 @db0.memo(singleton=True) -@dataclass -class Root: - greeting: str +class GreeterAppRoot: + def __init__(self, greeting, persons): + self.greeting = greeting + self.persons = persons + self.counter = 0 def greet(self): - print(self.greeting) + print(f"{self.greeting}{''.join(f', {person}' for person in self.persons)}!") + self.counter += 1 if __name__ == "__main__": - db0.init(dbzero_root="./data") - - if db0.exists(Root): - print("Process started with data...") - root = db0.fetch(Root) - else: - root = Root("Hello, World!") - - root.greet() # Output: Hello, World! + # Initialize dbzero + db0.init("./app-data", prefix="main") + # Initialize the application's root object + root = GreeterAppRoot("Hello", ["Michael", "Jennifer"]) + root.greet() # Output: Hello, Michael, Jennifer! + print(f"Greeted {root.counter} times.") ``` -When the process exits, the application state is persisted automatically. The same data will be available the next time the app starts. - -**Note:** All objects linked to `Root` (and any objects they reference) are automatically managed by dbzero. There's no need for explicit conversions, fetching, or saving — dbzero handles persistence transparently for the entire object graph. - ---- +The application state is persisted automatically; the same data will be available the next time the app starts. All objects are automatically managed by dbzero and there's no need for explicit conversions, fetching, or saving — dbzero handles persistence transparently for the entire object graph. 
## Core Concepts ### Memo Classes -Transform any Python class into a persistent, managed object by applying the `@db0.memo` decorator: +Transform any Python class into a persistent, automatically managed object by applying the `@db0.memo` decorator: ```python import dbzero as db0 @@ -109,171 +96,145 @@ class Person: # Instantiation works just like regular Python person = Person("Alice", 30) -# Attributes can be modified dynamically +# Attributes can be changed dynamically person.age += 1 person.address = "123 Main St" # Add new attributes on the fly ``` -**Features of memo classes:** -- Automatic durability and persistence -- Efficient memory/disk management -- Automatic UUID assignment -- Support for tagging and querying -- Reference counting for garbage collection - ### Collections -dbzero provides durable, transactional versions of Python's built-in collections: +dbzero provides persistent versions of Python's built-in collections: ```python -# Create persistent collections -my_dict = db0.dict(name="Alice", age=30) -my_list = db0.list([1, 2, 3, 4, 5]) -my_tuple = db0.tuple(("a", "b", "c")) - -# They work just like standard Python collections -my_dict["city"] = "New York" -my_list.append(6) -print(len(my_list)) # 6 -``` +from datetime import date -All standard operations are supported, and changes are automatically persisted within transactions. 
+person = Person("John", 25) -### Transactions +# Assign persistent collections to memo object +person.appointment_dates = {date(2026, 1, 12), date(2026, 1, 13), date(2026, 1, 14)} -By default, dbzero uses **autocommit** mode, periodically persisting changes to storage: +person.skills = ["Python", "C++", "Docker"] -```python -# Manual commit for batch operations -tasks = db0.list() -for i in range(100): - tasks.append(Task(f"Task {i}")) +person.contact_info = { + "email": "john@example.com", + "phone": "+1-555-0100", + "linkedin": "linkedin.com/in/john" +} + +# Use them as usual +date(2026, 1, 13) in person.appointment_dates # True + +person.skills.append("Kubernetes") +print(person.skills) # Output: ['Python', 'C++', 'Docker', 'Kubernetes'] + +person.contact_info["github"] = "github.com/john" +person.contact_info["email"] # "john@example.com" -db0.commit() # Persist all 100 tasks at once ``` +All standard operations are supported, and changes are automatically persisted. + ### Queries and Tags -Find objects using type-based queries and flexible tag logic: +Find objects using tag-based queries and flexible logic operators: ```python # Create and tag objects -person = Person("Alice", 30) -db0.tags(person).add("employee") -db0.tags(person).add("manager") +person = Person("Susan", 31) +db0.tags(person).add("employee", "manager") -# Find by type -all_persons = db0.find(Person) +person = Person("Michael", 29) +db0.tags(person).add("employee", "developer") -# Find by tag -managers = db0.find("manager") +# Find every Person by type +result = db0.find(Person) -# Combine type and tags (AND logic) -employee_persons = db0.find(Person, "employee") +# Combine type and tags (AND logic) to find employees +employees = db0.find(Person, "employee") -# OR logic using a list -results = db0.find(["tag1", "tag2"]) +# OR logic using a list to find managers and developers +staff = db0.find(["manager", "developer"]) -# NOT logic using db0.no() +# NOT logic using db0.no() to find employees 
which aren't managers non_managers = db0.find("employee", db0.no("manager")) ``` ### Snapshots and Time Travel -Create isolated, read-only views of your data at any point in time: +Create isolated views of your data at any point in time: ```python -# Create a snapshot of the current state -with db0.snapshot() as snap: - # Changes to live objects won't affect the snapshot - obj.value = 999 - - # Query the snapshot - sees old data - old_obj = snap.fetch(MemoTestClass) - assert old_obj.value == 123 - -# Time travel to a specific transaction -state_num = db0.get_state_num() -# ... make changes ... - -with db0.snapshot(state_num) as past_snap: - # Access data exactly as it was at that state - past_obj = past_snap.fetch(obj_uuid) +person = Person("John", 25) +person.balance = 1500 +# Keep the current state +state = db0.get_state_num() +# Commit changes explicitly to advance the state immediately +db0.commit() + +# Change the balance +person.balance -= 300 +db0.commit() + +print(f"{person.name} balance: {person.balance}") # John balance: 1200 +# Open snapshot view with past state number +with db0.snapshot(state) as snap: + past_person = snap.fetch(db0.uuid(person)) + print(f"{past_person.name} balance: {past_person.balance}") # John balance: 1500 ``` ### Prefixes (Data Partitioning) -Organize data into isolated partitions with independent commit histories: +Organize data into independent, isolated partitions: ```python -# Scope a class to a specific prefix -@db0.memo(prefix="settings-prefix") +@db0.memo(singleton=True, prefix="settings-prefix") class AppSettings: def __init__(self, theme: str): self.theme = theme -# Open and work with different prefixes -db0.open("user-data", "rw") # Set current prefix -settings = AppSettings(theme="dark") # Goes to "settings-prefix" -note = UserNote(content="Hello") # Goes to "user-data" +@db0.memo(prefix="app-data-prefix") +class Note: + def __init__(self, content: str): + self.content = content + +settings = AppSettings(theme="dark") # 
Data goes to "settings-prefix.db0" +note = Note("Hello dbzero!") # Data goes to "app-data-prefix.db0" ``` ### Indexes -Create fast, sorted access to your data: +Index your data for fast range queries and sorting: ```python -# Create an index -priority_queue = db0.index() - -# Add items with keys for sorting -task1 = Task("High priority task") -task2 = Task("Low priority task") - -priority_queue.add(1, task1) # key=1 -priority_queue.add(10, task2) # key=10 - -# Iterate in sorted order -for key, task in priority_queue: - print(f"Priority {key}: {task.description}") -``` - ---- - -## Advanced Features - -### Multi-Process Synchronization +from datetime import datetime -dbzero automatically synchronizes data between processes: +@db0.memo() +class Event: + def __init__(self, event_id: int, occurred: datetime): + self.event_id = event_id + self.occurred = occurred -```python -# Writer process -db0.open(prefix_name, "w") -obj.value = 124 -db0.commit() +events = [ + Event(100, datetime(2026, 1, 28)), + Event(101, datetime(2026, 1, 30)), + Event(102, datetime(2026, 1, 29)), + Event(103, datetime(2026, 2, 1)), +] -# Reader process - sees changes automatically -db0.open(prefix_name, "r") -assert obj.value == 124 # Updated value visible -``` - -### Change Data Capture +# Create an index +event_index = db0.index() +# Populate with objects +for event in events: + event_index.add(event.occurred, event) -Compare snapshots to identify changes: +# Query events from January 2026 +query = event_index.select(datetime(2026, 1, 1), datetime(2026, 1, 31)) +# Sort ascending by date of occurrence +query_sorted = event_index.sort(query) +print([event.event_id for event in query_sorted]) # Output: [100, 102, 101] -```python -snap_before = db0.snapshot() -# ... make changes ... 
-snap_after = db0.snapshot() - -# Find what changed -new_items = db0.select_new(query, snap_before, snap_after) -deleted_items = db0.select_deleted(query, snap_before, snap_after) -modified_items = db0.select_modified(query, snap_before, snap_after) ``` ---- - ## Scalability dbzero provides tools to build scalable applications: @@ -284,42 +245,20 @@ dbzero provides tools to build scalable applications: These features give you the flexibility to design distributed architectures that fit your needs. ---- - -## Configuration - -Configure dbzero during initialization: - -```python -db0.init( - dbzero_root="/path/to/data", - config={ - 'autocommit': True, - 'autocommit_interval': 367, # milliseconds - 'cache_size': 8 << 30 # 8GiB - } -) -``` - -Or configure specific prefixes: - -```python -db0.open("my-prefix", "rw", autocommit=False) -``` - ---- ## Use Cases +In our experience, **dbzero** fits many real-life use cases, including: + - **Web Applications** - Unified state management for backend services -- **Data Processing Pipelines** - Efficient batch operations with transaction support -- **Event-Driven Systems** - Change data capture and time travel for auditing +- **Data Processing Pipelines** - Efficient and simple data preparation +- **Event-Driven Systems** - Capturing data changes and time travel for auditing - **AI Applications** - Simplified state management for AI agents and workflows ---- - ## Why dbzero? +The short answer is illustrated by the diagram below: + ### Traditional Stack ``` Application Code @@ -340,33 +279,26 @@ Application Code + dbzero Storage ``` -By eliminating intermediate layers, dbzero reduces complexity, improves performance, and accelerates development—all while providing the reliability and features you expect from a database system. 
- ---- +By eliminating intermediate layers, dbzero reduces complexity, improves performance, and accelerates development—all while providing the reliability and features you expect from a regular database system. ## Documentation -For comprehensive documentation, visit: **[docs.dbzero.io](https://docs.dbzero.io)** +Check our docs to learn more: **[docs.dbzero.io](https://docs.dbzero.io)** -Topics covered: -- API Reference +There you can find: +- Guides - Tutorials -- Data Modeling Patterns -- Performance Optimization -- Other Guides - ---- +- Performance tips +- API Reference ## License -This project is licensed under the GNU Affero General Public License v3.0 (AGPLv3). See `LICENSE` for the full text. +This project is licensed under the GNU Affero General Public License v3.0 (AGPLv3). See [LICENSE](./LICENSE) for the full text. - If you modify and run this software over a network, you must offer the complete corresponding source code to users interacting with it (AGPLv3 §13). - Redistributions must preserve copyright and license notices and provide source. -For attribution details, see `NOTICE`. - ---- +For attribution details, see [NOTICE](./NOTICE). ## Commercial Support @@ -380,8 +312,6 @@ We offer: Contact us at: **info@dbzero.io** ---- - ## Support - **Documentation**: [docs.dbzero.io](https://docs.dbzero.io) diff --git a/dbzero/dbzero/atomic.py b/dbzero/dbzero/atomic.py index 1649a075..2418e4b9 100644 --- a/dbzero/dbzero/atomic.py +++ b/dbzero/dbzero/atomic.py @@ -50,7 +50,7 @@ def cancel(self): def atomic() -> AtomicManager: - """Create a context manager to group multiple mutating operations into a single indivisible transaction. + """Open a context manager to group multiple mutating operations into a single indivisible transaction. This function ensures that all modifications within the `with` block are applied together, or none are applied at all. 
If the block completes successfully, all changes are merged into the current diff --git a/dbzero/dbzero/compare.py b/dbzero/dbzero/compare.py index 129d5344..ac6791b1 100644 --- a/dbzero/dbzero/compare.py +++ b/dbzero/dbzero/compare.py @@ -53,18 +53,18 @@ def compare(obj_1: Memo, obj_2: Memo, tags: Optional[List[Tag]] = None) -> bool: >>> >>> # Default comparison ignores tags and returns True >>> # because their content is the same. - >>> assert dbzero.compare(obj_A, obj_B) == True + >>> assert dbzero.compare(obj_A, obj_B) is True >>> >>> # Including the 'featured' tag in the comparison >>> # returns False because obj_A lacks the tag. - >>> assert dbzero.compare(obj_A, obj_B, tags=['featured']) == False + >>> assert dbzero.compare(obj_A, obj_B, tags=['featured']) is False >>> >>> # Now add the tag to obj_A as well >>> dbzero.tags(obj_A).add("featured") >>> dbzero.commit() >>> >>> # The comparison now returns True - >>> assert dbzero.compare(obj_A, obj_B, tags=['featured']) == True + >>> assert dbzero.compare(obj_A, obj_B, tags=['featured']) is True """ if _compare(obj_1, obj_2): # if objects are identical then also compare tags diff --git a/dbzero/dbzero/dbzero.pyi b/dbzero/dbzero/dbzero.pyi index ce20f76c..4ac6d092 100644 --- a/dbzero/dbzero/dbzero.pyi +++ b/dbzero/dbzero/dbzero.pyi @@ -85,7 +85,7 @@ def close(prefix_name: Optional[str] = None) -> None: ... def commit(prefix_name: Optional[str] = None) -> None: - """Save all in-memory object changes to persistent storage. + """Persist all pending data changes. Finalizes the current open transaction, ensuring data is durable and consistent. @@ -115,7 +115,7 @@ def commit(prefix_name: Optional[str] = None) -> None: # Object retrieval and management def fetch(identifier: Union[str, type], expected_type: Optional[type] = None, prefix: Optional[str] = None) -> Memo: - """Retrieve a single object directly from memory using its unique identifier. 
+ """Retrieve a dbzero object instance by its UUID or type (for singletons). The fastest way to access an object, operating in constant time O(1). It is guaranteed that only one instance of an object exists in memory for a given UUID. @@ -132,7 +132,7 @@ def fetch(identifier: Union[str, type], expected_type: Optional[type] = None, pr Raises exception if the fetched object is not an instance of this type. prefix : str, optional Optional name of the data prefix to fetch the object from. - Useful for retrieving singletons from non-default prefixes. + Used for retrieving singletons from non-default prefixes. Returns ------- @@ -169,7 +169,7 @@ def fetch(identifier: Union[str, type], expected_type: Optional[type] = None, pr ... def exists(identifier: Union[str, type], expected_type: Optional[type] = None, prefix: Optional[str] = None) -> bool: - """Check if a dbzero object exists. + """Check if an identifier points to a valid dbzero object or an existing singleton instance Can check by UUID or by singleton type. Allows to verify if an object is still available before trying to retrieve it. @@ -218,15 +218,15 @@ def exists(identifier: Union[str, type], expected_type: Optional[type] = None, p ... def uuid(obj: Memo, /) -> str: - """Get the unique, persistent identifier (UUID) for a dbzero-managed object. + """Get the unique object ID (UUID) of a memo instance. Returns a stable handle that allows the object to be reliably fetched with dbzero.fetch() across sessions. The UUID is a base-32 encoded string. Parameters ---------- - obj : Any - A dbzero-managed object or weak proxy to get the UUID of. + obj : Memo + A memo instance or weak proxy to get the UUID of. Returns ------- @@ -253,7 +253,7 @@ def uuid(obj: Memo, /) -> str: ... def load(obj: Any, /, *, exclude: Optional[Union[List[str], Tuple[str, ...]]] = None, **kwargs: Any) -> Any: - """Recursively convert any object into its equivalent native Python representation. 
+ """Load a dbzero instance recursively into memory as its equivalent native Python representation Useful for exporting application state for APIs or functions expecting standard Python types, like JSON serialization. Intelligently handles both standard Python and dbzero types. @@ -276,7 +276,7 @@ def load(obj: Any, /, *, exclude: Optional[Union[List[str], Tuple[str, ...]]] = * Native types: Returned as-is * dbzero collections: Converted to built-in counterparts (list, tuple, set, dict) - * dbzero.enum values: Converted to string representation + * @dbzero.enum values: Converted to string representation * @dbzero.memo instances: Converted to dictionaries (or using custom __load__ method) Raises @@ -378,7 +378,7 @@ def hash(obj: Any, /) -> int: """ ... -def set_prefix(object: Memo, prefix: Optional[str]) -> None: +def set_prefix(object: Memo, prefix: Optional[str] = None) -> None: """Set the persistence prefix for a Memo instance dynamically at runtime. Allows to control which data prefix an object belongs to. @@ -554,7 +554,7 @@ def rename_field(class_obj: type, from_name: str, to_name: str) -> None: # Cache management def clear_cache() -> None: - """Manually evicts all objects from the in-memory cache. + """Manually evict all objects from the in-memory cache. Examples -------- @@ -856,7 +856,7 @@ def bytearray(source: Union[bytes, Iterable[int]] = b'', /) -> ByteArrayObject: # Tag and query functions def tags(*objects: Memo) -> ObjectTagManager: - """Get a tag manager instance for Memo objects. + """Get a tag manager interface for given Memo objects. Parameters ---------- @@ -866,7 +866,7 @@ def tags(*objects: Memo) -> ObjectTagManager: Returns ------- ObjectTagManager - A ObjectTagManager instance for given Memo objects. + A ObjectTagManager interface for given Memo objects. 
Examples -------- @@ -900,7 +900,7 @@ def find(*query_criteria: Union[Tag, List[Tag], Tuple[Tag], QueryObject, TagSet] Parameters ---------- - *query_criteria : Union[Tag, List[Tag], Tuple[Tag], Query, TagSet] + *query_criteria : Union[Tag, List[Tag], Tuple[Tag], QueryObject, TagSet] Variable number of criteria to filter objects: * Type: A class to filter by type (includes subclasses) @@ -908,7 +908,7 @@ def find(*query_criteria: Union[Tag, List[Tag], Tuple[Tag], QueryObject, TagSet] * Object tag: Any memo object used as a tag * List of tags (OR): Objects with at least one of the specified tags * Tuple of tags (AND): Objects with all of the specified tags - * Query: Result of another query + * QueryObject: Result of another query * TagSet: Set logical operation. prefix : str, optional Optional data prefix to run the query on. @@ -916,7 +916,7 @@ def find(*query_criteria: Union[Tag, List[Tag], Tuple[Tag], QueryObject, TagSet] Returns ------- - Query + QueryObject An iterable query object. Examples @@ -964,7 +964,7 @@ def no(predicate: Union[str, QueryObject], /) -> TagSet: Parameters ---------- - predicate : str or Query + predicate : str or QueryObject The condition to negate. 
Returns @@ -979,6 +979,12 @@ def no(predicate: Union[str, QueryObject], /) -> TagSet: >>> # Find active projects but exclude those on hold >>> active_not_on_hold = dbzero.find("active-project", dbzero.no("on-hold")) + Complex exclusions: + + >>> # Find objects with tag1 but not in a specific result set + >>> excluded_set = dbzero.find("excluded-group") + >>> filtered = dbzero.find("tag1", dbzero.no(excluded_set)) + Calculate query deltas (find differences): >>> # Compare snapshots to find changes @@ -987,20 +993,11 @@ def no(predicate: Union[str, QueryObject], /) -> TagSet: >>> >>> # Find newly added (in query_2 but NOT in query_1) >>> newly_added = snap2.find(query_2, dbzero.no(query_1)) # Objects 3, 4 - - Complex exclusions: - - >>> # Find users who are active but not administrators - >>> regular_users = dbzero.find("active", dbzero.no("admin")) - >>> - >>> # Find objects with tag1 but not in a specific result set - >>> excluded_set = dbzero.find("excluded-group") - >>> filtered = dbzero.find("tag1", dbzero.no(excluded_set)) """ ... def as_tag(obj: Union[Memo, MemoWeakProxy, type]) -> Tag: - """Create a searchable Tag object from a Memo instance or class. + """Make a searchable Tag from a Memo instance or class. Allows to use Memo object or class as a label for other objects. Tags created from objects are stable identifiers that will @@ -1049,7 +1046,7 @@ def split_by(tags: List[Tag], query: QueryObject, exclusive: bool = True) -> Que ---------- tags : List[Tag] A list of tags to split results by. - query : Query + query : QueryObject The input query whose result set will be categorized. exclusive : bool, default True Controls handling of items belonging to multiple groups: @@ -1059,9 +1056,9 @@ def split_by(tags: List[Tag], query: QueryObject, exclusive: bool = True) -> Que Returns ------- - Query + QueryObject A new query yielding (item, decorator) tuples where item is from - the original query and decorator is the matched tag/enum group. 
+ the original query and decorator is the matched group. Examples -------- @@ -1108,12 +1105,12 @@ def filter(filter: Callable[[Any], bool], query: QueryObject) -> QueryObject: filter : Callable[[Any], bool] A function or lambda that takes a single object as argument. Must return True to include the object, False to exclude it. - query : Query + query : QueryObject A query to filter. Returns ------- - Query + QueryObject A query that only yields items for which filter function returned True. Examples @@ -1144,21 +1141,13 @@ def filter(filter: Callable[[Any], bool], query: QueryObject) -> QueryObject: ... lambda city: match_tokens_in_order(city.name, search_phrase), ... results ... ) - - Complex conditions: - - >>> # Filter users by multiple criteria - >>> active_adults = dbzero.filter( - ... lambda user: user.age >= 18 and user.status == "active", - ... dbzero.find(User) - ... ) """ ... # State and statistics functions def get_state_num(prefix: Optional[str] = None, finalized: bool = False) -> int: - """Return the state number for a given data prefix as a version identifier. + """Return the state number for a given data prefix. The state number increments with each transaction commit, crucial for tracking changes, creating snapshots of specific states, and synchronization tasks. @@ -1205,7 +1194,7 @@ def get_state_num(prefix: Optional[str] = None, finalized: bool = False) -> int: # Snapshot functions def snapshot(state_spec: Optional[Union[int, Dict[str, int]]] = None) -> Snapshot: - """Create a read-only, point-in-time view of the prefix for time-travel queries. + """Get a read-only snapshot view of dbzero state. Essential for isolating long-running queries from concurrent writes, analyzing past states, or ensuring consistent state for complex operations. @@ -1252,7 +1241,7 @@ def snapshot(state_spec: Optional[Union[int, Dict[str, int]]] = None) -> Snapsho ... 
def get_snapshot_of(obj: Memo, /) -> Snapshot: - """Retrieve the Snapshot instance from which a given object originates. + """Get the Snapshot instance from which a given object originates. Parameters ---------- @@ -1281,7 +1270,7 @@ def get_snapshot_of(obj: Memo, /) -> Snapshot: ... def is_memo(obj: Any, /) -> bool: - """Check if a given object is a dbzero memo class or instance of one. + """Check if a given object is a dbzero memo class or memo instance. Parameters ---------- @@ -1300,29 +1289,29 @@ def is_memo(obj: Any, /) -> bool: Check memo class and instance: >>> @dbzero.memo - ... class MemoizedClass: + ... class MemoClass: ... def __init__(self, value): ... self.value = value >>> - >>> memo_instance = MemoizedClass(42) - >>> assert dbzero.is_memo(MemoizedClass) == True # Class type - >>> assert dbzero.is_memo(memo_instance) == True # Instance + >>> memo_instance = MemoClass(42) + >>> assert dbzero.is_memo(MemoClass) is True # Class type + >>> assert dbzero.is_memo(memo_instance) is True # Instance Check non-memo objects: >>> class RegularClass: ... pass - >>> assert dbzero.is_memo(RegularClass) == False - >>> assert dbzero.is_memo(123) == False - >>> assert dbzero.is_memo("hello") == False - >>> assert dbzero.is_memo([1, 2, 3]) == False + >>> assert dbzero.is_memo(RegularClass) is False + >>> assert dbzero.is_memo(123) is False + >>> assert dbzero.is_memo("hello") is False + >>> assert dbzero.is_memo([1, 2, 3]) is False Check other dbzero types: >>> Colors = dbzero.enum("Colors", ["RED", "GREEN", "BLUE"]) >>> managed_list = dbzero.list([1, 2, 3]) - >>> assert dbzero.is_memo(Colors.RED) == False - >>> assert dbzero.is_memo(managed_list) == False + >>> assert dbzero.is_memo(Colors.RED) is False + >>> assert dbzero.is_memo(managed_list) is False """ ... 
@@ -1355,8 +1344,8 @@ def is_singleton(obj: Any, /) -> bool: Check singleton status: - >>> assert dbzero.is_singleton(user_alice) == False - >>> assert dbzero.is_singleton(app_settings) == True + >>> assert dbzero.is_singleton(user_alice) is False + >>> assert dbzero.is_singleton(app_settings) is True """ ... @@ -1420,7 +1409,7 @@ def is_enum_value(value: Any, /) -> bool: ... def get_schema(cls: type, /) -> Dict[str, Dict[str, Any]]: - """Introspect all in-memory instances of a @dbzero.memo class to generate dynamic schema. + """Introspect all in-memory instances of a @dbzero.memo class to deduce dynamic schema. Provides current overview of attributes and their most common data types across all objects of the class. Schema adapts to runtime changes. @@ -1474,7 +1463,7 @@ def get_schema(cls: type, /) -> Dict[str, Dict[str, Any]]: ... def get_config() -> Dict[str, Any]: - """Retrieve the active configuration settings for the dbzero instance. + """Retrieve the active configuration settings for dbzero. Get the configuration currently in use, including both parameters provided during dbzero.init() and default values for unspecified parameters. @@ -1490,7 +1479,7 @@ def get_config() -> Dict[str, Any]: Raises ------ Exception - If called after the dbzero instance has been closed with dbzero.close(). + If called after the dbzero has been closed with dbzero.close(). Examples -------- @@ -1756,7 +1745,7 @@ def weak_proxy(obj: Memo) -> MemoWeakProxy: ... def expired(proxy_object: MemoWeakProxy) -> bool: - """Check if the target object of a dbzero.weak_proxy has been garbage-collected. + """Check if a weak reference proxy has expired (the object was garbage collected) Used to determine if the original object still exists and can be accessed. 
diff --git a/dbzero/dbzero/fast_query.py b/dbzero/dbzero/fast_query.py index d552721f..7c10a05b 100644 --- a/dbzero/dbzero/fast_query.py +++ b/dbzero/dbzero/fast_query.py @@ -321,7 +321,7 @@ def group_by(group_defs: Union[Callable, Tag, Tuple], query: QueryObject, ops: T Parameters ---------- - group_defs : lambda | Iterable[EnumValue] | str | tuple + group_defs : lambda | Tag | tuple The criteria used to group the objects. This can be: * A lambda function: Applied to each object to determine its grouping key. @@ -330,8 +330,8 @@ def group_by(group_defs: Union[Callable, Tag, Tuple], query: QueryObject, ops: T The group keys will be the string names of the enum members. * A tuple of the above: For multi-level grouping. The resulting dictionary keys will be tuples. - query : Any - A dbzero QueryObject to be grouped. + query : QueryObject + A dbzero query to be grouped. ops : tuple of callable, default (count_op,) A tuple of aggregation operations to perform on each group. diff --git a/dbzero/dbzero/storage_api.py b/dbzero/dbzero/storage_api.py index 87745bc8..4305fdfb 100644 --- a/dbzero/dbzero/storage_api.py +++ b/dbzero/dbzero/storage_api.py @@ -9,7 +9,7 @@ PrefixMetaData = namedtuple("PrefixMetaData", ["name", "uuid"]) def get_prefixes() -> Iterator[PrefixMetaData]: - """Discover and return all available prefixes from the configured dbzero storage location. + """Return all prefixes accessible from the current context. Returns ------- @@ -39,7 +39,7 @@ def get_prefixes() -> Iterator[PrefixMetaData]: def get_mutable_prefixes() -> Iterator[PrefixMetaData]: - """Return all currently open prefixes that can be modified (read-write mode). + """Return prefixes currently open in read-write mode. 
Returns ------- diff --git a/src/dbzero/core/utils/hash_func.cpp b/src/dbzero/core/utils/hash_func.cpp index fbfcd385..1010b71e 100644 --- a/src/dbzero/core/utils/hash_func.cpp +++ b/src/dbzero/core/utils/hash_func.cpp @@ -7,6 +7,7 @@ namespace db0 { + // MurmurHash algorithm by Austin Appleby std::uint64_t murmurhash64A(const void* key, std::size_t len, std::uint64_t seed) { const std::uint64_t m = 0xc6a4a7935bd1e995ULL; From 7cd2ce8658ae1197b10a736618a00d8f3b2a54cf Mon Sep 17 00:00:00 2001 From: Tomasz Wejroch Date: Thu, 18 Dec 2025 09:09:29 +0100 Subject: [PATCH 2/2] Added 'exists' method to Snapshot interface --- dbzero/dbzero/interfaces.py | 29 +++++++++++++++++++++++++++++ 1 file changed, 29 insertions(+) diff --git a/dbzero/dbzero/interfaces.py b/dbzero/dbzero/interfaces.py index cffd114c..c1db8907 100644 --- a/dbzero/dbzero/interfaces.py +++ b/dbzero/dbzero/interfaces.py @@ -201,6 +201,35 @@ def fetch(self, id: Union[str, type], type: Optional[type] = None, prefix: Optio ------- Memo The requested Memo object instance. + + Raises + ------ + Exception + If the object cannot be found or type validation fails. + """ + ... + + def exists(self, identifier: Union[str, type], expected_type: Optional[type] = None, prefix: Optional[str] = None) -> bool: + """Check if an identifier points to a valid dbzero object or an existing singleton instance. + + Parameters + ---------- + identifier : str or type + The identifier of the object to check for. + + * str: Check for an object by its unique identifier (UUID) + * type: Check for an instance of this singleton type + expected_type : type, optional + Optional expected type when checking by UUID. + Verifies the found object is an instance of this type. + prefix : str, optional + Optional prefix name to search within. Defaults to the currently active prefix. + Only used when checking singleton types. + + Returns + ------- + bool + True if the object exists (and matches type if specified), False otherwise. """ ...
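The lookup rules documented for `exists` can be illustrated without dbzero itself. The following is a minimal sketch under stated assumptions: `ToySnapshot`, `_objects`, `_singletons`, `User`, and `Settings` are hypothetical stand-ins invented for illustration and are not part of the dbzero API. Only the parameter names and the documented return behaviour are taken from the docstring in this patch.

```python
from typing import Dict, Optional, Tuple, Union


class User:
    """Hypothetical regular (non-singleton) class."""


class Settings:
    """Hypothetical singleton class."""


class ToySnapshot:
    """In-memory stand-in for illustration only; NOT the dbzero Snapshot."""

    def __init__(self, current_prefix: str = "default") -> None:
        self._current_prefix = current_prefix
        # Objects keyed by their unique string identifier.
        self._objects: Dict[str, object] = {}
        # Singleton instances keyed by (prefix name, singleton type).
        self._singletons: Dict[Tuple[str, type], object] = {}

    def exists(
        self,
        identifier: Union[str, type],
        expected_type: Optional[type] = None,
        prefix: Optional[str] = None,
    ) -> bool:
        if isinstance(identifier, str):
            # str: look up by unique identifier, then optionally verify the type.
            if identifier not in self._objects:
                return False
            obj = self._objects[identifier]
            return expected_type is None or isinstance(obj, expected_type)
        # type: look for a singleton instance; prefix defaults to the active one.
        key = (prefix if prefix is not None else self._current_prefix, identifier)
        return key in self._singletons


snap = ToySnapshot()
snap._objects["id-1"] = User()
snap._singletons[("default", Settings)] = Settings()

found_by_id = snap.exists("id-1")
type_mismatch = snap.exists("id-1", expected_type=Settings)
missing_id = snap.exists("id-2")
singleton_here = snap.exists(Settings)
singleton_elsewhere = snap.exists(Settings, prefix="other")
```

The point of the sketch is the contract rather than the storage: `fetch` is documented to raise when the object cannot be found or type validation fails, whereas `exists` reports the same conditions as `False`, so callers can probe without a `try`/`except`.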