grafeo-llamaindex

LlamaIndex PropertyGraphStore backed by GrafeoDB, an embedded graph database with native vector search.

Build knowledge graphs from documents, query them with GQL, and run vector similarity search, all in a single .db file. No servers, no infrastructure.

Install

uv add grafeo-llamaindex

Quickstart

from llama_index.core import PropertyGraphIndex, SimpleDirectoryReader
from grafeo_llamaindex import GrafeoPropertyGraphStore

documents = SimpleDirectoryReader("./data").load_data()

graph_store = GrafeoPropertyGraphStore(db_path="./knowledge_graph.db")

index = PropertyGraphIndex.from_documents(
    documents,
    property_graph_store=graph_store,
    embed_kg_nodes=True,
)

retriever = index.as_retriever(include_text=True)
nodes = retriever.retrieve("What are the key relationships?")

Features

Full PropertyGraphStore: all 8 abstract methods implemented (get, get_triplets, get_rel_map, upsert_nodes, upsert_relations, delete, structured_query, vector_query)
Structured + vector queries: supports_structured_queries = True and supports_vector_queries = True in a single store
Embedded database: no Docker, no cloud, no external services. Just uv add grafeo
Single-file persistence: the entire knowledge graph lives in one .db file
Native HNSW vector search: embeddings stored alongside graph nodes, no separate vector DB needed
Multi-language queries: GQL, Cypher, Gremlin, GraphQL, SPARQL and SQL/PGQ all supported
Built-in graph algorithms: PageRank, Louvain, shortest paths, centrality and 30+ more via graph_store.client.algorithms

API Reference

`GrafeoPropertyGraphStore`

from grafeo_llamaindex import GrafeoPropertyGraphStore

store = GrafeoPropertyGraphStore(
    db_path=None,                # str | None - path for persistent storage, None for in-memory
    embedding_dimensions=1536,   # int - vector dimensions for HNSW index
    embedding_metric="cosine",   # str - "cosine", "euclidean", "dot_product", or "manhattan"
    dedup_threshold=None,        # float | None - cosine similarity threshold for entity dedup
)
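
The four embedding_metric names correspond to standard vector comparisons. The sketch below writes them out in plain Python for reference only. It is not GrafeoDB's implementation, the function names are hypothetical, and how the database turns each metric into a ranking score (distance versus similarity) is not specified here.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between a and b (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def euclidean_distance(a, b):
    """Straight-line (L2) distance between a and b."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dot_product(a, b):
    """Unnormalized inner product of a and b."""
    return sum(x * y for x, y in zip(a, b))


def manhattan_distance(a, b):
    """Sum of absolute coordinate differences (L1 distance)."""
    return sum(abs(x - y) for x, y in zip(a, b))


a = [1.0, 0.0]
b = [0.0, 1.0]
print(cosine_similarity(a, b), euclidean_distance(a, b))
print(dot_product(a, b), manhattan_distance(a, b))
```

Cosine ignores vector length, which is why it is a common choice for text embeddings; the other three are sensitive to magnitude.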

Properties:

store.client: access the underlying grafeo.GrafeoDB instance for direct queries and algorithms
store.supports_structured_queries: True
store.supports_vector_queries: True

Methods (PropertyGraphStore interface):

| Method | Description |
| --- | --- |
| `upsert_nodes(nodes)` | Insert or update `EntityNode` / `ChunkNode` objects |
| `upsert_relations(relations)` | Insert edges between existing nodes |
| `get(properties, ids)` | Retrieve nodes by ID or property filter |
| `get_triplets(entity_names, relation_names, ids)` | Get `(source, relation, target)` triplets |
| `get_rel_map(graph_nodes, depth, ignore_rels)` | BFS traversal from seed nodes |
| `delete(entity_names, relation_names, ids)` | Remove nodes and/or edges |
| `structured_query(query)` | Execute raw GQL/Cypher (or Gremlin with `g.` prefix) |
| `vector_query(query)` | HNSW similarity search over node embeddings |
| `get_schema()` / `get_schema_str()` | Inspect graph labels, edge types, and properties |
| `persist(path)` | Save in-memory database to disk |
| `close()` | Close the database connection |
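
To make the "BFS traversal from seed nodes" description of get_rel_map concrete, here is a minimal stdlib sketch of a depth-limited breadth-first walk over triplets. The function name bfs_rel_map and the tuple-based graph are hypothetical, and the sketch follows outgoing edges only; the store's actual traversal direction and return types may differ.

```python
from collections import deque


def bfs_rel_map(triplets, seeds, depth=2, ignore_rels=None):
    """Collect triplets reachable from `seeds` within `depth` hops."""
    ignore = set(ignore_rels or [])

    # Build an adjacency map: source -> [(relation, target), ...],
    # dropping ignored relation types up front.
    adjacency = {}
    for source, relation, target in triplets:
        if relation not in ignore:
            adjacency.setdefault(source, []).append((relation, target))

    visited = set(seeds)
    queue = deque((seed, 0) for seed in seeds)
    found = []
    while queue:
        node, level = queue.popleft()
        if level >= depth:
            continue  # do not expand nodes at the depth limit
        for relation, target in adjacency.get(node, []):
            found.append((node, relation, target))
            if target not in visited:
                visited.add(target)
                queue.append((target, level + 1))
    return found


triplets = [
    ("Alice", "WORKS_AT", "Acme"),
    ("Acme", "LOCATED_IN", "Berlin"),
    ("Berlin", "PART_OF", "Germany"),
    ("Alice", "MENTIONS", "chunk-1"),
]
result = bfs_rel_map(triplets, ["Alice"], depth=2, ignore_rels=["MENTIONS"])
print(result)
```

With depth=2 the walk stops before expanding Berlin, so the Berlin to Germany edge is not returned, and the ignored MENTIONS edge never enters the adjacency map.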

Persistence

The entire knowledge graph lives in a single .db file. Pass db_path to store data on disk, or omit it for in-memory use.

from grafeo_llamaindex import GrafeoPropertyGraphStore

# Create and populate
store = GrafeoPropertyGraphStore(db_path="./my_graph.db")
# ... upsert nodes and relations ...
store.close()

# Reopen later with the same path
store = GrafeoPropertyGraphStore(db_path="./my_graph.db")
print(store.node_count, store.edge_count)  # data is still there

You can also save an in-memory store to disk:

store = GrafeoPropertyGraphStore()  # in-memory
# ... populate ...
store.persist("./snapshot.db")

Deduplication

When dedup_threshold is set, upsert_nodes checks whether an incoming EntityNode's embedding is similar enough to an existing node (same label) to merge them instead of creating a duplicate.

from grafeo_llamaindex import GrafeoPropertyGraphStore

store = GrafeoPropertyGraphStore(
    dedup_threshold=0.95,  # cosine similarity threshold
    embedding_dimensions=1536,
)

Key behavior:

Threshold semantics: if cosine_similarity(new, existing) >= dedup_threshold, the new node merges into the existing one (properties are overwritten, the original created_at timestamp is preserved).
Label-scoped: dedup only compares nodes with the same label. A "Person" and a "Company" with identical embeddings are never merged.
ChunkNode excluded: ChunkNode objects are never deduplicated, only EntityNode.
Requires embedding: nodes without an embedding are never deduplicated.
Runtime toggle: you can set store.dedup_threshold = 0.9 at any time and it takes effect on the next upsert_nodes call.
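
The rules above can be sketched in plain Python. This is an illustration of the documented semantics, not the library's code: the Node dataclass and upsert_with_dedup function are hypothetical, "overwritten" is assumed to mean that incoming property values win per key, and the ChunkNode exclusion is not modeled.

```python
import math
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    name: str
    label: str
    embedding: Optional[list] = None
    properties: dict = field(default_factory=dict)
    created_at: float = 0.0


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def upsert_with_dedup(existing, new, threshold):
    """Merge `new` into a similar same-label node, or append it."""
    # No threshold or no embedding: dedup is skipped entirely.
    if threshold is not None and new.embedding is not None:
        for node in existing:
            if node.label != new.label or node.embedding is None:
                continue  # dedup is label-scoped and needs embeddings
            if cosine(new.embedding, node.embedding) >= threshold:
                # Assumption: incoming values win; created_at is left alone.
                node.properties.update(new.properties)
                return node
    existing.append(new)
    return new


nodes = [Node("Alice Smith", "Person", [1.0, 0.0], {"role": "engineer"}, created_at=100.0)]

# Near-identical embedding and same label: merged into the existing node.
merged = upsert_with_dedup(
    nodes, Node("A. Smith", "Person", [0.99, 0.05], {"role": "manager"}, created_at=200.0), 0.95
)
# Identical embedding but different label: kept as a separate node.
upsert_with_dedup(nodes, Node("Smith Co", "Company", [1.0, 0.0]), 0.95)
# No embedding: never deduplicated.
upsert_with_dedup(nodes, Node("Alice Smith", "Person"), 0.95)

print(len(nodes), merged.properties, merged.created_at)
```

A threshold close to 1.0 merges only near-identical embeddings; lowering it merges more aggressively and raises the risk of collapsing distinct entities.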

Relation Upsert Behavior

upsert_relations skips, rather than raising on, any relation whose source_id or target_id does not match an existing node (by name or LlamaIndex ID). It emits a UserWarning for each skipped relation, so you can detect or escalate skipped relations with Python's warnings module.
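
One way to detect skipped relations is to record warnings around the call. The sketch below uses a hypothetical stand-in, upsert_relations_stub, that mimics the skip-and-warn behavior described above so the example runs without a database; the warning message text is invented, and only the standard warnings calls are real.

```python
import warnings


def upsert_relations_stub(relations, known_nodes):
    """Hypothetical stand-in: skip relations with missing endpoints and warn."""
    kept = []
    for source, label, target in relations:
        if source not in known_nodes or target not in known_nodes:
            warnings.warn(
                f"Skipping relation {source}-[{label}]->{target}: missing node",
                UserWarning,
            )
            continue
        kept.append((source, label, target))
    return kept


relations = [("Alice", "KNOWS", "Bob"), ("Alice", "KNOWS", "Carol")]

# Record every warning raised inside the block instead of printing it.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    kept = upsert_relations_stub(relations, known_nodes={"Alice", "Bob"})

skipped = [w for w in caught if issubclass(w.category, UserWarning)]
print(f"kept={len(kept)} skipped={len(skipped)}")
for w in skipped:
    print(w.message)
```

To fail fast instead, calling warnings.simplefilter("error", UserWarning) inside the catch_warnings block turns each skipped relation into a raised exception.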

Comparison

| Feature | Neo4j | FalkorDB | Grafeo |
| --- | --- | --- | --- |
| Requires server | Yes | Yes | No (embedded) |
| Vector search | Plugin (5.x+) | Limited | Native HNSW |
| Graph algorithms | GDS plugin ($) | Built-in | Built-in (30+) |
| Query languages | Cypher | Cypher | GQL, Cypher, Gremlin, GraphQL, SPARQL, SQL/PGQ |
| Deployment | Docker/Cloud | Docker/Cloud | `uv add grafeo` |
| Persistence | Server-managed | Server-managed | Single `.db` file |

Examples

See the examples/ directory:

mock_embedding_demo.py: full demo with hand-crafted embeddings, no API key required
basic_graph_rag.py: build a Property Graph Index from documents and query it (requires OpenAI API key)
hybrid_retrieval.py: structured queries + vector search + PageRank, all in one script

Development

uv sync                  # install deps
uv run pytest -v         # run tests
uv run ruff check .      # lint
uv run ruff format .     # format
uv run ty check          # type check

License

Apache-2.0
