Merged
42 changes: 24 additions & 18 deletions README.md
@@ -2,37 +2,43 @@

# HexTree

hextree provides tree structures that represent geographic regions
with [H3 cell]s.
HexTree provides tree structures for efficiently representing geographic
regions using [H3 cell]s. It takes advantage of H3's hierarchical structure
to automatically compact large regions and provide fast spatial queries.

The primary structures are:

- [**HexTreeMap**]: an H3 cell-to-value map.
- [**HexTreeSet**]: an H3 cell set for hit-testing.
- [**HexTreeSet**]: an H3 cell set for spatial containment testing.

You can think of `HexTreeMap` vs. `HexTreeSet` as [`HashMap`] vs. [`HashSet`].

## How is this different from `HashMap<H3Cell, V>`?

The key feature of a hextree is that its keys (H3 cells) are
hierarchical. For instance, if you previously inserted an entry for a
low-res cell, but later query for a higher-res child cell, the tree
returns the value for the lower res cell. Additionally, with
[compaction], trees can automatically coalesce adjacent high-res cells
into their parent cell. For very large regions, the compaction process
_can_ continue to lowest resolution cells (res-0), possibly removing
millions of redundant cells from the tree. For example, a set of
4,795,661 res-7 cells representing North America coalesces [into a
42,383 element `HexTreeSet`][us915].

A hextree's internal structure exactly matches the semantics of an [H3
cell]. The root of the tree has 122 resolution-0 nodes, followed by 15
levels of 7-ary nodes. The level of an occupied node, or leaf node, is
the same as its corresponding H3 cell resolution.
HexTree leverages H3's hierarchical cell structure in two key ways:

**Hierarchical Queries**: When you query for a cell, the tree returns
a value even if only a parent cell was inserted. For instance, if you
insert a low-res cell but later query for a higher-res child cell, the
tree returns the value from the parent.

**Automatic Compaction**: With [compaction], the tree can automatically
coalesce 7 adjacent child cells into their parent cell, dramatically
reducing memory usage. For very large regions, compaction can continue
recursively to the lowest resolution cells (res-0), possibly removing
millions of redundant cells. For example, 4,795,661 res-7 cells
representing North America compact [into just 42,383 elements][us915].

The internal structure mirrors H3's hierarchy: the root contains 122
resolution-0 base cells, with each level below being a 7-ary tree
(matching H3's 7 possible child cells per parent). The tree supports
up to 15 levels of resolution, where the depth of a leaf node corresponds
to its H3 cell resolution.
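
To make the hierarchical-query behavior concrete, here is a minimal toy sketch, not the crate's actual implementation. It models a single base cell as a 7-ary tree and addresses cells by their child digits. `Node`, `empty_parent`, `insert`, and `get` are hypothetical names invented for this illustration, and ignoring inserts below an existing leaf is a simplification chosen for the toy.

```rust
/// Toy stand-in for one base cell of a hextree (hypothetical type, not
/// the crate's `Node`): either a leaf holding a value, or an interior
/// node with up to 7 children, one per H3 child digit.
enum Node<V> {
    Leaf(V),
    Parent([Option<Box<Node<V>>>; 7]),
}

impl<V> Node<V> {
    /// An interior node with no children yet.
    fn empty_parent() -> Self {
        Node::Parent(Default::default())
    }

    /// Inserts `value` at the cell named by `digits` (each digit must be
    /// in 0..7), creating interior nodes along the way.
    fn insert(&mut self, digits: &[u8], value: V) {
        let Some((&digit, rest)) = digits.split_first() else {
            // Path exhausted: this node becomes the leaf.
            *self = Node::Leaf(value);
            return;
        };
        match self {
            // Toy choice: an existing leaf already covers all descendants,
            // so a finer insert below it is ignored.
            Node::Leaf(_) => {}
            Node::Parent(children) => children[digit as usize]
                .get_or_insert_with(|| Box::new(Node::empty_parent()))
                .insert(rest, value),
        }
    }

    /// Returns the depth (resolution) and value of the leaf covering
    /// `digits`, which may be an ancestor of the queried cell.
    fn get(&self, digits: &[u8]) -> Option<(usize, &V)> {
        let mut node = self;
        let mut depth = 0;
        loop {
            match node {
                Node::Leaf(value) => return Some((depth, value)),
                Node::Parent(children) => {
                    // Running out of digits at an interior node means no
                    // leaf covers the queried cell.
                    let digit = *digits.get(depth)?;
                    node = children[digit as usize].as_deref()?;
                    depth += 1;
                }
            }
        }
    }
}
```

Inserting at digits `[3, 1]` and then querying `[3, 1, 6, 0]` finds the depth-2 leaf, while querying the coarser `[3]` finds nothing, which mirrors the parent-to-child lookup direction described above.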

## Features

* **`serde`**: support for serialization via [serde].
* **`disktree`**: on-disk memory-mapped storage for large trees (enables `serde`, `byteorder`, and `memmap`).

## License

34 changes: 19 additions & 15 deletions src/cell.rs
@@ -1,14 +1,14 @@
//! This has two different types representing H3 indices is slightly
//! This has two different types representing H3 indices in slightly
//! different ways, [Index] & [Cell]. Index is lower level and allows
//! you create invalid H3 indices. Cell is higher level and enforces
//! you to create invalid H3 indices. Cell is higher level and enforces
//! invariants.

use crate::{Error, Result};
use std::{convert::TryFrom, fmt};

/// A low-level type for H3 [index manipulation].
///
/// Node that all setters take consume `self` and return a new
/// Note that all setters consume `self` and return a new
/// `Index`.
///
/// [index manipulation]: https://observablehq.com/@nrabinowitz/h3-index-bit-layout?collection=@nrabinowitz/h3
@@ -112,7 +112,7 @@ impl Index {
}
}

/// Consumes `self` and returns a new Index with it's resolution
/// Consumes `self` and returns a new Index with its resolution
/// `res` digit set to `digit`.
///
/// This function does not check `res` nor `digit` for validity
@@ -129,7 +129,10 @@ impl Index {
}
}

/// [HexTreeMap][crate::HexTreeMap]'s key type.
/// A validated H3 cell index.
///
/// This is the key type for [HexTreeMap][crate::HexTreeMap]. A `Cell`
/// is guaranteed to be a valid H3 cell (mode 1 index).
#[derive(Clone, Copy, Eq, Hash, PartialEq)]
#[cfg_attr(
feature = "serde",
@@ -153,7 +156,7 @@ impl Cell {
if
// reserved must be 0
!idx.reserved() &&
// we only care about mode 1 (cell) indicies
// we only care about mode 1 (cell) indices
idx.mode() == 1 &&
// there are only 122 base cells
idx.base() < 122
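
As a rough illustration of the three checks visible in this hunk, the sketch below applies them to a raw `u64` using the standard H3 bit layout (reserved bit 63, mode in bits 59-62, resolution in bits 52-55, base cell in bits 45-51). The helper names `bits`, `looks_like_cell`, and `resolution` are hypothetical and not part of the crate, and the real validation may check more than what the truncated hunk shows.

```rust
/// Extracts `width` bits of `raw` starting at bit `offset`.
/// (Hypothetical helper for this sketch.)
const fn bits(raw: u64, offset: u32, width: u32) -> u64 {
    (raw >> offset) & ((1u64 << width) - 1)
}

/// Applies only the three checks shown in the hunk above:
/// reserved bit clear, mode 1 (cell), and base cell below 122.
fn looks_like_cell(raw: u64) -> bool {
    let reserved = bits(raw, 63, 1);
    let mode = bits(raw, 59, 4);
    let base = bits(raw, 45, 7);
    reserved == 0 && mode == 1 && base < 122
}

/// Reads the 4-bit resolution field.
fn resolution(raw: u64) -> u8 {
    bits(raw, 52, 4) as u8
}
```

For the index `0x8a2a1072b59ffff`, these helpers report mode 1, resolution 10, and base cell 21, so it passes all three checks.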
@@ -172,8 +175,9 @@ impl Cell {

/// Returns this cell's parent at the specified resolution.
///
/// Returns Some if `res` is less-than or equal-to this cell's
/// resolution, otherwise returns None.
/// Returns `Some` if `res` is less than or equal to this cell's
/// resolution. Returns `None` if `res` is greater than this cell's
/// resolution (you cannot get a higher-resolution parent).
#[inline]
pub const fn to_parent(&self, res: u8) -> Option<Self> {
match self.res() {
@@ -203,12 +207,12 @@ impl Cell {
Index(self.0).res()
}

/// Returns true if `self` is related to `other`.
/// Returns `true` if this cell is related to another cell.
///
/// "Related" can be any of the following:
/// - `self` == `other`
/// - `self` is a parent cell of `other`
/// - `other` is a parent cell of `self`
/// Two cells are related if they share a parent-child relationship:
/// - `self` and `other` are the same cell, or
/// - `self` is an ancestor (parent, grandparent, etc.) of `other`, or
/// - `other` is an ancestor of `self`
#[inline]
pub fn is_related_to(&self, other: &Self) -> bool {
let common_res = std::cmp::min(self.res(), other.res());
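
The relationship between `to_parent` and `is_related_to` can be sketched on raw `u64` indices. This is an assumption-laden illustration rather than the crate's code: it relies on the standard H3 bit layout (4-bit resolution at bit 52, one 3-bit digit per resolution, unused digits set to 7), and the free functions `res`, `to_parent`, and `is_related` are hypothetical stand-ins for the methods on `Cell`.

```rust
/// Reads the 4-bit resolution field (bits 52-55 in the H3 layout).
fn res(raw: u64) -> u8 {
    ((raw >> 52) & 0xF) as u8
}

/// Coarsens `raw` to `parent_res`. Returns `None` when `parent_res` is
/// finer than the cell's own resolution, since a parent cannot have a
/// higher resolution than its child.
fn to_parent(raw: u64, parent_res: u8) -> Option<u64> {
    if parent_res > res(raw) {
        return None;
    }
    // Overwrite the resolution field with the coarser resolution.
    let mut out = (raw & !(0xF_u64 << 52)) | ((parent_res as u64) << 52);
    // Digits finer than `parent_res` are marked unused (all ones, i.e. 7).
    // The digit for resolution `r` sits at bit offset 3 * (15 - r).
    for r in (parent_res + 1)..=15 {
        out |= 0b111_u64 << (3 * (15 - r as u32));
    }
    Some(out)
}

/// Two cells are related when they coincide after both are promoted to
/// the coarser of their two resolutions.
fn is_related(a: u64, b: u64) -> bool {
    let common_res = res(a).min(res(b));
    to_parent(a, common_res) == to_parent(b, common_res)
}
```

Under these assumptions, the res-10 index `0x8a2a1072b59ffff` has the res-9 parent `0x892a1072b5bffff`. The two are related in either argument order, whereas two distinct res-10 siblings are not.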
@@ -238,7 +242,7 @@ impl TryFrom<i64> for Cell {
}
}

/// A type for building up Cells in an iterative matter when
/// A type for building up Cells in an iterative manner when
/// tree-walking.
pub(crate) struct CellStack(Option<Cell>);

@@ -282,7 +286,7 @@ impl CellStack {
}
}

/// If self currency contains a cell, this replaces the digit at
/// If self currently contains a cell, this replaces the digit at
/// its current res and returns what was there. If self is empty,
/// nothing is replaced and None is returned.
pub fn swap(&mut self, digit: u8) -> Option<u8> {
30 changes: 22 additions & 8 deletions src/compaction.rs
@@ -1,21 +1,27 @@
//! User pluggable compaction.
//! User-pluggable compaction strategies.
//!
//! Compaction allows the tree to automatically coalesce child cells into
//! their parent when certain conditions are met, reducing memory usage
//! and improving query performance.

use crate::Cell;

/// A user provided compactor.
/// A user-provided compactor.
///
/// The compactor trait allows you customize compaction behavior after
/// The compactor trait allows you to customize compaction behavior after
/// calling `insert` on a tree.
pub trait Compactor<V> {
/// Called after every insert into a non-leaf node.
///
/// Given an intermediate (not-leaf) node's cell and up to 7
/// Given an intermediate (non-leaf) node's cell and up to 7
/// children, you can choose to leave the node alone by returning
/// `None`, or turn it into a leaf-node by return `Some(value)`.
/// `None`, or turn it into a leaf node by returning `Some(value)`.
fn compact(&mut self, cell: Cell, children: [Option<&V>; 7]) -> Option<V>;
}

/// Does not perform any compaction.
/// A compactor that performs no compaction.
///
/// This is the default compactor and leaves all inserted cells as-is.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
#[cfg_attr(feature = "serde", derive(serde::Serialize, serde::Deserialize))]
pub struct NullCompactor;
@@ -26,7 +32,11 @@ impl<V> Compactor<V> for NullCompactor {
}
}

/// Compacts when all children are complete.
/// A compactor that coalesces nodes when all 7 children are present.
///
/// This is typically used with `HexTreeSet` (where values are `()`).
/// When all 7 children of a node are complete, they are replaced with
/// a single parent cell.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
#[cfg_attr(feature = "serde", derive(serde::Serialize, serde::Deserialize))]
pub struct SetCompactor;
@@ -41,7 +51,11 @@ impl Compactor<()> for SetCompactor {
}
}

/// Compacts when all children are complete and have the same value.
/// A compactor that coalesces nodes when all 7 children have equal values.
///
/// When all 7 children of a node are present and have the same value,
/// they are replaced with a single parent cell containing that value.
/// This is useful for maps where large contiguous regions share the same value.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
#[cfg_attr(feature = "serde", derive(serde::Serialize, serde::Deserialize))]
pub struct EqCompactor;
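
The decision rules the doc comments describe for `SetCompactor` and `EqCompactor` can be sketched as free functions over the same `[Option<&V>; 7]` children array that `Compactor::compact` receives. The `impl` bodies are elided in this diff, so the functions below are a guess at the logic rather than the crate's code. `compact_if_full` and `compact_if_equal` are hypothetical names, and the `PartialEq + Clone` bound is an assumption.

```rust
/// Set-style rule: coalesce only when all 7 children are present.
/// (Hypothetical stand-in for what `SetCompactor` is documented to do.)
fn compact_if_full(children: [Option<&()>; 7]) -> Option<()> {
    children.iter().all(Option::is_some).then_some(())
}

/// Map-style rule: coalesce only when all 7 children are present and
/// hold equal values; the parent then takes that shared value.
/// (Hypothetical stand-in for what `EqCompactor` is documented to do.)
fn compact_if_equal<V: PartialEq + Clone>(children: [Option<&V>; 7]) -> Option<V> {
    // A missing first child already rules out compaction.
    let first = children[0]?;
    if children.iter().all(|child| *child == Some(first)) {
        Some(first.clone())
    } else {
        None
    }
}
```

Returning `None` leaves the node alone, and returning `Some(value)` turns it into a leaf, matching the contract stated on the `Compactor` trait above.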
14 changes: 9 additions & 5 deletions src/disktree/mod.rs
@@ -1,4 +1,8 @@
//! An on-disk hextree.
//! On-disk memory-mapped storage for HexTree.
//!
//! DiskTree provides a serialized, memory-mapped representation of a HexTreeMap,
//! allowing you to store and query very large trees without loading them entirely
//! into memory.

#[cfg(not(target_pointer_width = "64"))]
compile_warning!("disktree may silently fail on non-64bit systems");
@@ -36,7 +40,7 @@ mod tests {
}

// Construct map with a compactor that automatically combines
// cells with the same save value.
// cells with the same value.
let mut monaco = HexTreeMap::with_compactor(EqCompactor);

// Now extend the map with cells and a region value.
@@ -190,7 +194,7 @@ mod tests {
}

// Construct map with a compactor that automatically combines
// cells with the same save value.
// cells with the same value.
let mut monaco = HexTreeMap::new();

// Now extend the map with cells and a region value.
@@ -204,7 +208,7 @@ mod tests {
.unwrap();
let monaco_disktree = DiskTreeMap::open(path).unwrap();

// Create the iterator with the user-defined deserialzer.
// Create the iterator with the user-defined deserializer.
let disktree_iter = monaco_disktree.iter().unwrap();
let start = std::time::Instant::now();
let mut disktree_collection = Vec::new();
@@ -294,7 +298,7 @@ mod tests {
assert_eq!(
leaf_vec.len(),
1,
"Iterator must have extactly one element for a leaf"
"Iterator must have exactly one element for a leaf"
);
assert_eq!(hextree_leaf, leaf_vec[0].0);
}
5 changes: 4 additions & 1 deletion src/disktree/tree.rs
@@ -16,7 +16,10 @@ use std::{
pub(crate) const HDR_MAGIC: &[u8] = b"hextree\0";
pub(crate) const HDR_SZ: usize = HDR_MAGIC.len() + 1;

/// An on-disk hextree map.
/// A memory-mapped, on-disk HexTreeMap.
///
/// This structure provides read-only access to a HexTreeMap that has
/// been serialized to disk.
pub struct DiskTreeMap(pub(crate) Box<dyn AsRef<[u8]> + Send + Sync + 'static>);

impl DiskTreeMap {
49 changes: 23 additions & 26 deletions src/hex_tree_map.rs
@@ -42,7 +42,7 @@ use std::{cmp::PartialEq, iter::FromIterator};
/// }
///
/// // Construct map with a compactor that automatically combines
/// // cells with the same save value.
/// // cells with the same value.
/// let mut monaco = HexTreeMap::with_compactor(EqCompactor);
///
/// // Now extend the map with cells and a region value.
@@ -66,7 +66,7 @@ use std::{cmp::PartialEq, iter::FromIterator};
pub struct HexTreeMap<V, C = NullCompactor> {
/// All h3 0 base cell indices in the tree
pub(crate) nodes: Box<[Option<Box<Node<V>>>]>,
/// User-provided compator. Defaults to the null compactor.
/// User-provided compactor. Defaults to the null compactor.
compactor: C,
}

@@ -121,7 +121,7 @@ impl<V, C> HexTreeMap<V, C> {
/// `self`.
///
/// This method is useful if you want to use one compaction
/// strategy for creating an initial, then another one for updates
/// strategy for creating an initial tree, then another one for updates
/// later.
pub fn replace_compactor<NewC>(self, new_compactor: NewC) -> HexTreeMap<V, NewC> {
HexTreeMap {
@@ -130,12 +130,11 @@ impl<V, C> HexTreeMap<V, C> {
}
}

/// Returns the number of H3 cells in the set.
/// Returns the number of H3 cells in the map.
///
/// This method only considers complete, or leaf, cells in the
/// set. Due to automatic compaction, this number may be
/// significantly smaller than the number of source cells used to
/// create the set.
/// This method only counts leaf cells (complete entries) in the
/// map. Due to automatic compaction, this number may be
/// significantly smaller than the number of cells originally inserted.
pub fn len(&self) -> usize {
self.nodes.iter().flatten().map(|node| node.len()).sum()
}
@@ -145,17 +144,15 @@ impl<V, C> HexTreeMap<V, C> {
self.len() == 0
}

/// Returns `true` if the set fully contains `cell`.
/// Returns `true` if the map fully contains `cell`.
///
/// This method will return `true` if any of the following are
/// true:
/// This method returns `true` if any of the following are true:
///
/// 1. There was an earlier [insert][Self::insert] call with
/// precisely this target cell.
/// 2. Several previously inserted cells coalesced into
/// precisely this target cell.
/// 3. The set contains a complete (leaf) parent of this target
/// cell due to 1 or 2.
/// 1. This exact cell was previously inserted.
/// 2. Several previously inserted cells were compacted into
/// this cell as their parent.
/// 3. The map contains a parent of this cell (due to 1 or 2),
/// meaning this cell inherits its parent's value.
pub fn contains(&self, cell: Cell) -> bool {
let base_cell = cell.base();
match self.nodes[base_cell as usize].as_ref() {
@@ -167,11 +164,11 @@ impl<V, C> HexTreeMap<V, C> {
}
}

/// Returns a reference to the value corresponding to the given
/// target cell or one of its parents.
/// Returns a reference to the value for the given cell or its nearest parent.
///
/// Note that this method also returns a Cell, which may be a
/// parent of the target cell provided.
/// Returns `Some((cell, value))` where `cell` is either the queried cell
/// or a parent cell that contains it. Returns `None` if no matching cell
/// or parent is found.
#[inline]
pub fn get(&self, cell: Cell) -> Option<(Cell, &V)> {
match self.get_raw(cell) {
@@ -192,11 +189,11 @@ impl<V, C> HexTreeMap<V, C> {
}
}

/// Returns a mutable reference to the value corresponding to the
/// given target cell or one of its parents.
/// Returns a mutable reference to the value for the given cell or its nearest parent.
///
/// Note that this method also returns a Cell, which may be a
/// parent of the target cell provided.
/// Returns `Some((cell, value))` where `cell` is either the queried cell
/// or a parent cell that contains it. Returns `None` if no matching cell
/// or parent is found.
#[inline]
pub fn get_mut(&mut self, cell: Cell) -> Option<(Cell, &mut V)> {
match self.get_raw_mut(cell) {
@@ -242,7 +239,7 @@ impl<V, C> HexTreeMap<V, C> {
crate::iteration::IterMut::new(&mut self.nodes, CellStack::new())
}

/// An iterator visiting the specified cell or its children
/// An iterator visiting the specified cell or its children with
/// references to the values.
pub fn descendants(&self, cell: Cell) -> impl Iterator<Item = (Cell, &V)> {
let base_cell = cell.base();
2 changes: 1 addition & 1 deletion src/hex_tree_set.rs
@@ -2,7 +2,7 @@ use crate::{compaction::SetCompactor, Cell, HexTreeMap};
use std::iter::FromIterator;

/// A HexTreeSet is a structure for representing geographical regions
/// and efficiently testing performing hit-tests on that region. Or,
/// and efficiently performing hit-tests on that region. Or,
/// in other words: I have a region defined; does it contain this
/// point on earth?
///