Designing an effective schema is crucial for leveraging the full power of TelaMentis. This guide provides best practices for modeling your knowledge graph based on the current Phase 1 implementation.
Based on the current TelaMentis implementation:
- Nodes: Represent entities with id_alias, label, and props
- TimeEdges: Represent bitemporal relationships with valid_from/valid_to
- Multi-Tenancy: All data is scoped to a TenantId
- Current Storage: Neo4j adapter with property-based tenant isolation
Current Implementation:
let person = Node::new("Person")
.with_id_alias("user_alice@example.com")
.with_property("name", json!("Alice"))
.with_property("email", json!("alice@example.com"));
- Purpose: Use id_alias for deterministic node identification across upsert operations
- Current Behavior: The Neo4j adapter uses MERGE operations with id_alias for idempotency
- Choosing an id_alias:
  - Must be unique within a tenant for a given entity type
  - Examples: user_alice@example.com, product_sku_ABC123, document_hash_sha256
  - Should be stable across application restarts
- Absence of id_alias: Creates a new node on each upsert (useful for events or logs)
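The alias rules above can be sketched as a small helper. This is only an illustration: make_alias is a hypothetical function, not part of TelaMentis, and the normalization it applies (trimming whitespace, lowercasing the entity-type prefix) is an assumption about what keeps aliases stable in your application.

```rust
/// Hypothetical helper (not a TelaMentis API): builds an alias of the form
/// `<entity_type>_<natural_key>`, e.g. `user_alice@example.com`.
fn make_alias(entity_type: &str, natural_key: &str) -> String {
    // Lowercase the prefix so "User" and "user" produce the same alias.
    let prefix = entity_type.trim().to_lowercase();
    // Keep the key's case (SKUs and hashes may be case-sensitive), but drop
    // stray whitespace that would otherwise create a second node on upsert.
    let key = natural_key.trim();
    format!("{}_{}", prefix, key)
}
```

Because the alias is derived only from the entity type and a natural key, the same input yields the same alias after a restart, which is what makes repeated upserts idempotent.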
Current Implementation:
// Good examples
Node::new("Person") // Clear entity type
Node::new("Company") // Specific business entity
Node::new("Document") // Content type
Node::new("Event") // Temporal occurrence
// Avoid
Node::new("Object") // Too generic
Node::new("thing") // Inconsistent casing
- Conventions in Phase 1:
  - Use PascalCase for labels (e.g., UserProfile, SocialMediaPost)
  - Be consistent across your application
  - Single label per node (multi-label support planned for Phase 2)
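If you want to enforce the label convention before calling Node::new, a minimal check might look like the sketch below. is_pascal_case is a hypothetical helper, not something TelaMentis provides. It only verifies a leading uppercase letter and the absence of separators, so it cannot tell a generic label such as Object from a useful one.

```rust
/// Hypothetical validator (not a TelaMentis API): returns true when the label
/// starts with an ASCII uppercase letter and contains only ASCII letters and
/// digits, i.e. no underscores, spaces, or hyphens.
fn is_pascal_case(label: &str) -> bool {
    let mut chars = label.chars();
    match chars.next() {
        // The remaining characters must all be alphanumeric.
        Some(first) if first.is_ascii_uppercase() => chars.all(|c| c.is_ascii_alphanumeric()),
        // Empty string or a lowercase/non-letter first character.
        _ => false,
    }
}
```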
Current Implementation:
let user = Node::new("Person")
.with_id_alias("user_123")
.with_property("name", json!("Alice Wonderland"))
.with_property("email", json!("alice@example.com"))
.with_property("age", json!(30))
.with_property("created_at", json!("2023-01-15T10:00:00Z"))
.with_property("preferences", json!({
"theme": "dark",
"notifications": true
}));
- Data Types: JSON values support strings, numbers, booleans, arrays, and objects
- Temporal Properties: Store as ISO8601 strings for consistency
- Nested Data: Use sparingly; prefer relationships for complex associations
- Indexing: Frequently queried properties should be indexed (handled by the Neo4j adapter)
Current Implementation:
// Good examples
TimeEdge::new(alice_id, acme_id, "WORKS_FOR", start_time, props)
TimeEdge::new(user_id, post_id, "AUTHORED", creation_time, props)
TimeEdge::new(person_id, location_id, "LIVES_IN", move_in_time, props)
// Naming conventions
"WORKS_FOR" // UPPER_SNAKE_CASE
"IS_PARENT_OF" // Clear directionality
"PURCHASED" // Past tense for completed actions
"KNOWS" // Present tense for ongoing relationships
- Directionality: Edges have a clear from_node_id → to_node_id direction
- Granularity: Balance between too generic (RELATED_TO) and too specific
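Relationship types that come from free text (for example, labels produced by an LLM) may not follow the UPPER_SNAKE_CASE convention. The sketch below shows one way to normalize them before building a TimeEdge. to_upper_snake_case is a hypothetical helper, not a TelaMentis function, and it does not split camelCase input (worksFor becomes WORKSFOR).

```rust
/// Hypothetical normalizer (not a TelaMentis API): splits on any character
/// that is not an ASCII letter or digit, uppercases each piece, and joins
/// the pieces with underscores.
fn to_upper_snake_case(raw: &str) -> String {
    raw.split(|c: char| !c.is_ascii_alphanumeric())
        // Consecutive separators produce empty pieces; skip them.
        .filter(|part| !part.is_empty())
        .map(|part| part.to_ascii_uppercase())
        .collect::<Vec<_>>()
        .join("_")
}
```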
Current Implementation:
use chrono::{DateTime, Utc};
// Ongoing relationship (valid_to = None)
let current_job = TimeEdge::new(
alice_id,
company_id,
"WORKS_FOR",
"2023-01-15T09:00:00Z".parse::<DateTime<Utc>>()?,
json!({"role": "Engineer", "department": "Backend"})
);
// Completed relationship
let former_job = TimeEdge::new(
alice_id,
old_company_id,
"WORKS_FOR",
"2022-01-01T09:00:00Z".parse()?,
json!({"role": "Junior Developer"})
).with_valid_to("2023-01-10T17:00:00Z".parse()?);
Modeling Different Scenarios:
- Events (instantaneous):
let login_event = TimeEdge::new(
user_id,
session_id,
"LOGGED_IN",
event_time,
json!({"ip_address": "192.168.1.1"})
).with_valid_to(event_time); // Same time = instantaneous
- States (with duration):
let employment = TimeEdge::new(
person_id,
company_id,
"EMPLOYED_AT",
start_date,
json!({"position": "Senior Engineer"})
); // valid_to = None means currently employed
- Historical Facts:
let birth = TimeEdge::new(
person_id,
location_id,
"BORN_IN",
birth_date,
json!({"hospital": "General Hospital"})
).with_valid_to(birth_date); // Instantaneous historical fact
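To make the three scenarios concrete, the sketch below shows how an "as-of" check could treat them. It is a dependency-free illustration, not the adapter's actual logic: Validity is a hypothetical stand-in for the temporal fields of a TimeEdge, timestamps are plain Unix seconds instead of DateTime<Utc>, and the half-open interval [valid_from, valid_to) is an assumption.

```rust
/// Hypothetical stand-in for the temporal fields of a TimeEdge.
/// Timestamps are Unix seconds to keep the sketch free of dependencies.
struct Validity {
    valid_from: i64,
    valid_to: Option<i64>,
}

impl Validity {
    /// Returns true if the edge is considered valid at instant `t`.
    fn is_valid_at(&self, t: i64) -> bool {
        match self.valid_to {
            // Ongoing state: valid from the start onwards.
            None => t >= self.valid_from,
            // Instantaneous event: valid only at that single instant.
            Some(end) if end == self.valid_from => t == end,
            // Bounded state: assumed half-open, so the end instant is excluded.
            Some(end) => t >= self.valid_from && t < end,
        }
    }
}
```

With a half-open interval, two consecutive edges where one ends exactly when the next begins never both match at the handover instant.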
Current Implementation:
let friendship = TimeEdge::new(
alice_id, bob_id, "KNOWS",
met_date,
json!({
"how_met": "college",
"closeness": "close_friend",
"last_contact": "2024-01-01T00:00:00Z"
})
);
let purchase = TimeEdge::new(
customer_id, product_id, "PURCHASED",
purchase_date,
json!({
"quantity": 2,
"unit_price": 29.99,
"currency": "USD",
"order_id": "ORD-12345"
})
);
Current Implementation:
// Users
let alice = Node::new("Person")
.with_id_alias("user_alice")
.with_property("username", json!("alice_wonderland"))
.with_property("display_name", json!("Alice"));
// Messages
let message = Node::new("Message")
.with_id_alias("msg_12345")
.with_property("content", json!("Hello, world!"))
.with_property("platform", json!("twitter"));
// Relationships
let authored = TimeEdge::new(
alice_id, message_id, "AUTHORED",
post_time,
json!({"verified": true})
);
let reply = TimeEdge::new(
message_id, original_message_id, "REPLIES_TO",
reply_time,
json!({"thread_position": 2})
);
Using the OpenAI Connector:
// Extract from text using LLM
let context = ExtractionContext {
messages: vec![LlmMessage {
role: "user".to_string(),
content: "Alice Wonderland works at Acme Corp as a Senior Engineer since January 2023.".to_string(),
}],
system_prompt: Some("Extract people, organizations, and relationships.".to_string()),
max_tokens: Some(1000),
temperature: Some(0.1),
desired_schema: None,
};
let envelope = openai_connector.extract(&tenant, context).await?;
// Process extracted entities
for node in envelope.nodes {
let node_obj = Node::new(&node.label)
.with_id_alias(&node.id_alias)
.with_props(node.props);
let node_id = graph_store.upsert_node(&tenant, node_obj).await?;
}
// Process extracted relationships
for relation in envelope.relations {
// Look up node IDs by alias
let from_id = graph_store.get_node_by_alias(&tenant, &relation.from_id_alias).await?;
let to_id = graph_store.get_node_by_alias(&tenant, &relation.to_id_alias).await?;
if let (Some((from_uuid, _)), Some((to_uuid, _))) = (from_id, to_id) {
let mut edge = TimeEdge::new( // mut: reassigned below when valid_to is present
from_uuid, to_uuid, &relation.type_label,
relation.valid_from.unwrap_or_else(Utc::now),
relation.props
);
if let Some(valid_to) = relation.valid_to {
edge = edge.with_valid_to(valid_to);
}
graph_store.upsert_edge(&tenant, edge).await?;
}
}
Current Implementation:
// Organizations
let company = Node::new("Organization")
.with_id_alias("acme_corp")
.with_property("name", json!("Acme Corporation"))
.with_property("industry", json!("Technology"));
let department = Node::new("Department")
.with_id_alias("acme_engineering")
.with_property("name", json!("Engineering"))
.with_property("budget", json!(1000000));
// Hierarchical relationships
let dept_belongs = TimeEdge::new(
department_id, company_id, "BELONGS_TO",
dept_creation_date,
json!({"cost_center": "ENG001"})
);
let employment = TimeEdge::new(
person_id, department_id, "WORKS_IN",
hire_date,
json!({
"role": "Senior Engineer",
"salary_band": "L5",
"manager_id": "user_manager_bob"
})
);
Modeling role changes over time:
// Alice's role evolution at the same company
let initial_role = TimeEdge::new(
alice_id, company_id, "HAS_ROLE",
"2023-01-15T09:00:00Z".parse()?,
json!({"title": "Junior Engineer", "level": "L3"})
).with_valid_to("2023-06-01T00:00:00Z".parse()?);
let promotion = TimeEdge::new(
alice_id, company_id, "HAS_ROLE",
"2023-06-01T00:00:00Z".parse()?,
json!({"title": "Senior Engineer", "level": "L5"})
); // valid_to = None (current role)
Automatic Tenant Scoping:
// All operations are automatically scoped by tenant
let tenant_a = TenantId::new("company_a");
let tenant_b = TenantId::new("company_b");
// These are completely isolated
let alice_a = graph_store.upsert_node(&tenant_a, alice_node.clone()).await?;
let alice_b = graph_store.upsert_node(&tenant_b, alice_node.clone()).await?;
// Queries are automatically filtered
let nodes_a = graph_store.query(&tenant_a, find_people_query).await?; // Only tenant A data
let nodes_b = graph_store.query(&tenant_b, find_people_query).await?; // Only tenant B data
Under the Hood (Neo4j Implementation):
- All nodes get a _tenant_id property automatically
- All relationships get a _tenant_id property automatically
- Queries are automatically filtered with WHERE _tenant_id = $tenant_id
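The effect of property-based isolation can be illustrated with an in-memory model. This is not the Neo4j adapter's code: StoredNode, stored, and nodes_for_tenant are hypothetical names, and the sketch only mirrors the idea that every read is filtered on the _tenant_id property.

```rust
use std::collections::HashMap;

/// Hypothetical in-memory stand-in for a stored node.
struct StoredNode {
    label: String,
    props: HashMap<String, String>,
}

/// Builds a node and stamps it with the tenant property, as a write path would.
fn stored(label: &str, tenant_id: &str) -> StoredNode {
    let mut props = HashMap::new();
    props.insert("_tenant_id".to_string(), tenant_id.to_string());
    StoredNode { label: label.to_string(), props }
}

/// Mirrors `WHERE _tenant_id = $tenant_id`: only nodes whose tenant property
/// matches are returned; nodes without the property never match.
fn nodes_for_tenant<'a>(nodes: &'a [StoredNode], tenant_id: &str) -> Vec<&'a StoredNode> {
    nodes
        .iter()
        .filter(|n| n.props.get("_tenant_id").map(String::as_str) == Some(tenant_id))
        .collect()
}
```

Because isolation depends on a filter rather than on separate storage, every code path that reads data has to apply it, which is why the adapter adds it automatically.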
Using kgctl:
# Create tenant
kgctl tenant create company_a --name "Company A" --description "Production tenant"
# Import data to specific tenant
kgctl ingest csv --tenant company_a --file company_a_employees.csv
# Export tenant-specific data
kgctl export --tenant company_a --format graphml --output company_a_backup.xml
# Query tenant data
kgctl query nodes --tenant company_a --labels Person --limit 100
Automatic Indexing: The Neo4j adapter automatically creates these indexes:
// Tenant isolation
CREATE INDEX tenant_node_idx FOR (n) ON (n._tenant_id)
CREATE INDEX tenant_rel_idx FOR ()-[r]-() ON (r._tenant_id)
// Node lookups
CREATE INDEX node_alias_idx FOR (n) ON (n.id_alias)
CREATE INDEX system_id_idx FOR (n) ON (n.system_id)
// Temporal queries
CREATE INDEX valid_from_idx FOR ()-[r]-() ON (r.valid_from)
CREATE INDEX valid_to_idx FOR ()-[r]-() ON (r.valid_to)
Query Patterns:
// Efficient: Uses tenant + alias index
let node = graph_store.get_node_by_alias(&tenant, "user_alice").await?;
// Efficient: Uses tenant + label index
let query = GraphQuery::FindNodes {
labels: vec!["Person".to_string()],
properties: HashMap::new(),
limit: Some(100),
};
// Efficient: Uses temporal index
let current_relationships = GraphQuery::FindRelationships {
from_node_id: Some(alice_id),
to_node_id: None,
relationship_types: vec!["WORKS_FOR".to_string()],
valid_at: Some(Utc::now()), // Uses temporal index
limit: None,
};
CSV Import Performance:
# Batch size affects performance
kgctl ingest csv --tenant my_tenant --file large_dataset.csv --batch-size 1000
# Process multiple files efficiently
kgctl ingest csv --tenant my_tenant --file nodes.csv --file relationships.csv
Single Label per Node:
- Current: One label per node
- Workaround: Use properties for additional categorization
- Phase 2: Multi-label support planned
Basic Temporal Queries:
- Current: Simple "as-of" queries
- Workaround: Use date range filters in properties
- Phase 2: Full Allen's Interval Algebra
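Until richer interval operators are available, a range check can be done in application code on the values you fetch. The sketch below tests whether two validity intervals overlap. overlaps is a hypothetical helper, timestamps are Unix seconds for simplicity, and the half-open interpretation (an interval that ends exactly when another starts does not overlap it) is an assumption.

```rust
/// Hypothetical helper (not a TelaMentis API): returns true when the
/// half-open intervals [a_from, a_to) and [b_from, b_to) share any instant.
/// `None` as an end means the interval is still open (ongoing).
fn overlaps(a_from: i64, a_to: Option<i64>, b_from: i64, b_to: Option<i64>) -> bool {
    // Each interval must start before the other one ends;
    // an open-ended interval never ends, so the condition holds trivially.
    let a_starts_before_b_ends = b_to.map_or(true, |end| a_from < end);
    let b_starts_before_a_ends = a_to.map_or(true, |end| b_from < end);
    a_starts_before_b_ends && b_starts_before_a_ends
}
```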
Property-Only Tenant Isolation:
- Current: Property-based isolation only
- Phase 2: Database-level isolation planned
Multi-Label Workaround:
let person = Node::new("Person")
.with_id_alias("alice")
.with_property("additional_types", json!(["Employee", "Manager"]))
.with_property("primary_role", json!("Engineer"));
Complex Temporal Queries:
// Current: Basic temporal filtering
let query = GraphQuery::FindRelationships {
relationship_types: vec!["WORKS_FOR".to_string()],
valid_at: Some("2023-06-01T00:00:00Z".parse()?),
// ...
};
// Workaround for range queries: Use properties
let edge_with_duration = TimeEdge::new(
from_id, to_id, "EMPLOYED",
start_time,
json!({
"start_date": "2023-01-01",
"end_date": "2023-12-31",
"duration_days": 365
})
);
- Use meaningful id_alias values for all entities you'll reference
- Keep label values consistent and descriptive
- Store temporal information properly in valid_from/valid_to
- Use properties for filterable attributes
- Leverage automatic indexing by using standard query patterns
- Use batch operations for large datasets
- Consider data locality when designing relationships
- Always scope operations by tenant
- Plan tenant lifecycle management
- Use descriptive tenant IDs
- Be consistent with timezone handling (UTC)
- Use None for valid_to on ongoing relationships
- Model events as instantaneous (same valid_from/valid_to)
When Phase 2 features become available, current schemas will be forward-compatible:
- Multi-Label Support: Existing single labels will work seamlessly
- Advanced Temporal: Current TimeEdge data will support new query types
- Additional Isolation: Property-based isolation will remain the default
- Transaction Time: Will be automatically tracked for new data
The modular design ensures that schema improvements in Phase 2 won't require data migration for Phase 1 schemas.
By following these guidelines, you'll create robust, performant knowledge graphs that take full advantage of TelaMentis's current capabilities while being ready for future enhancements.