Classify supports sending classification results to Neo4j (graph database) and Elasticsearch (full-text search) via REST APIs.
- Neo4j: Stores graph structure with entities and relationships
- Elasticsearch: Stores full-text metadata for search and analytics
Both integrations are optional and use standard HTTP REST APIs (no additional dependencies required).
To enable the Neo4j integration, set these environment variables:
NEO4J_URL=http://localhost:7474
NEO4J_USERNAME=neo4j
NEO4J_PASSWORD=your-password
NEO4J_DATABASE=neo4j # optional, defaults to 'neo4j'
Neo4j receives the graph structure with:
- Document node with metadata (title, domain, doc_type, source_file, classified_at)
- Entity nodes (extracted classes, functions, modules, etc.)
- Relationships between document and entities
CREATE (doc:Document {
source_file: "typescript/database.ts",
classified_at: datetime(),
id: "Database Module",
title: "Database Module",
domain: "software",
doc_type: "code_documentation"
})
CREATE (e0:Module {name: "pg"})
CREATE (doc)-[:MENTIONS]->(e0)
POST http://localhost:7474/db/neo4j/tx/commit
Authorization: Basic base64(username:password)
Content-Type: application/json
{
"statements": [
{
"statement": "CREATE (doc:Document {...}) ..."
}
]
}
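The request above can be assembled in TypeScript roughly as follows. This is a minimal sketch, not the library's actual implementation: `buildNeo4jRequest` and its option names are hypothetical, and it assumes a runtime with global `btoa` and `fetch` (Node 18+ or a browser). Values are passed through the `parameters` field that the Neo4j HTTP API accepts alongside each statement, rather than being interpolated into the Cypher text.

```typescript
// Hypothetical helper: builds the HTTP request for Neo4j's transactional endpoint.
interface Neo4jOptions {
  url: string; // e.g. http://localhost:7474
  username: string;
  password: string;
  database?: string; // defaults to 'neo4j', mirroring NEO4J_DATABASE
}

interface Neo4jRequest {
  endpoint: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

function buildNeo4jRequest(
  opts: Neo4jOptions,
  statement: string,
  parameters: Record<string, unknown> = {},
): Neo4jRequest {
  const database = opts.database ?? 'neo4j';
  // btoa handles Latin-1 only; credentials with other characters
  // would need a UTF-8 aware encoder instead.
  const credentials = btoa(`${opts.username}:${opts.password}`);
  return {
    // Strip trailing slashes so the path is not doubled up.
    endpoint: `${opts.url.replace(/\/+$/, '')}/db/${database}/tx/commit`,
    init: {
      method: 'POST',
      headers: {
        Authorization: `Basic ${credentials}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({ statements: [{ statement, parameters }] }),
    },
  };
}

const request = buildNeo4jRequest(
  { url: 'http://localhost:7474', username: 'neo4j', password: 'your-password' },
  'CREATE (d:Document {source_file: $sourceFile, title: $title})',
  { sourceFile: 'typescript/database.ts', title: 'Database Module' },
);
// To send it: await fetch(request.endpoint, request.init);
```

Keeping the request construction separate from the network call makes it easy to inspect or log the payload before anything is written to the database.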
To enable the Elasticsearch integration, set these environment variables:
ELASTICSEARCH_URL=http://localhost:9200
ELASTICSEARCH_INDEX=classify-documents # optional, defaults to 'classify-documents'
Authentication (choose one):
# Option 1: Basic Auth
ELASTICSEARCH_USERNAME=elastic
ELASTICSEARCH_PASSWORD=your-password
# Option 2: API Key
ELASTICSEARCH_API_KEY=your-api-key
Elasticsearch receives full-text metadata:
{
"title": "Database Module",
"domain": "software",
"docType": "code_documentation",
"keywords": ["database", "postgresql", "connection"],
"summary": "Database connection and query utilities...",
"extractedFields": {
"language": "TypeScript",
"framework": "Node.js"
},
"sourceFile": "typescript/database.ts",
"classifiedAt": "2025-01-28T10:30:00.000Z",
"classification": {
"template": "software_project",
"confidence": 0.95
}
}
The index is created automatically with this mapping:
{
"mappings": {
"properties": {
"title": { "type": "text" },
"domain": { "type": "keyword" },
"docType": { "type": "keyword" },
"keywords": { "type": "keyword" },
"summary": { "type": "text" },
"extractedFields": { "type": "object" },
"sourceFile": { "type": "keyword" },
"classifiedAt": { "type": "date" },
"classification": {
"properties": {
"template": { "type": "keyword" },
"confidence": { "type": "float" }
}
}
}
}
}
Bulk Insert:
POST http://localhost:9200/_bulk?refresh=true
Content-Type: application/x-ndjson
{"index":{"_index":"classify-documents"}}
{"title":"...","domain":"..."}
{"index":{"_index":"classify-documents"}}
{"title":"...","domain":"..."}
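The bulk payload above alternates an action line with a document line, and the body must end with a newline. The following is a minimal sketch of how such a request could be built, not the library's actual code: `buildAuthHeader` and `buildBulkBody` are hypothetical names, the sample documents are illustrative, and the API key is assumed to already be in the encoded form that Elasticsearch expects after `ApiKey` in the `Authorization` header.

```typescript
// Hypothetical helpers; names and shapes are illustrative, not the library's API.
type EsAuth =
  | { kind: 'basic'; username: string; password: string }
  | { kind: 'apiKey'; encodedKey: string } // assumed to be the base64 "id:api_key" form
  | { kind: 'none' };

function buildAuthHeader(auth: EsAuth): Record<string, string> {
  if (auth.kind === 'basic') {
    // btoa handles Latin-1 only; sufficient for this sketch.
    return { Authorization: `Basic ${btoa(`${auth.username}:${auth.password}`)}` };
  }
  if (auth.kind === 'apiKey') {
    return { Authorization: `ApiKey ${auth.encodedKey}` };
  }
  return {};
}

function buildBulkBody(index: string, docs: object[]): string {
  const lines: string[] = [];
  for (const doc of docs) {
    lines.push(JSON.stringify({ index: { _index: index } })); // action line
    lines.push(JSON.stringify(doc)); // document source line
  }
  // Every line, including the last one, is terminated by a newline.
  return lines.map((line) => line + '\n').join('');
}

const bulkBody = buildBulkBody('classify-documents', [
  { title: 'Database Module', domain: 'software' },
  { title: 'Auth Module', domain: 'software' }, // made-up sample document
]);

const bulkHeaders = {
  'Content-Type': 'application/x-ndjson',
  ...buildAuthHeader({ kind: 'basic', username: 'elastic', password: 'your-password' }),
};
// To send it:
// await fetch('http://localhost:9200/_bulk?refresh=true',
//   { method: 'POST', headers: bulkHeaders, body: bulkBody });
```

Note that `refresh=true` makes documents searchable immediately, which is convenient for samples but adds overhead on large batches.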
import { Neo4jClient, ElasticsearchClient } from '@hivellm/classify';
// Initialize clients
const neo4j = new Neo4jClient({
url: 'http://localhost:7474',
username: 'neo4j',
password: 'password',
});
const elasticsearch = new ElasticsearchClient({
url: 'http://localhost:9200',
index: 'classify-documents',
});
await neo4j.initialize();
await elasticsearch.initialize();
// Insert single result
await neo4j.insertResult(classifyResult, 'path/to/file.ts');
await elasticsearch.insertResult(classifyResult, 'path/to/file.ts');
// Insert batch
await neo4j.insertBatch(results);
await elasticsearch.insertBatch(results);
// Cleanup
await neo4j.close();
await elasticsearch.close();
The samples/scripts/classify-samples.ts script automatically detects and uses configured databases:
# Set environment variables
export NEO4J_URL=http://localhost:7474
export NEO4J_USERNAME=neo4j
export NEO4J_PASSWORD=password
export ELASTICSEARCH_URL=http://localhost:9200
export ELASTICSEARCH_USERNAME=elastic
export ELASTICSEARCH_PASSWORD=password
# Run script
npx tsx samples/scripts/classify-samples.ts
Output:
╔═══════════════════════════════════════════════════╗
║ Classifying Code Samples for ES + Neo4j ║
╚═══════════════════════════════════════════════════╝
🔵 Initializing Neo4j client...
✅ Connected to Neo4j
🟢 Initializing Elasticsearch client...
✅ Connected to Elasticsearch
✅ Created index: classify-documents
... classification progress ...
╔═══════════════════════════════════════════════════╗
║ Sending to Databases ║
╚═══════════════════════════════════════════════════╝
🔵 Sending to Neo4j...
✅ Inserted 20 documents into Neo4j
🟢 Sending to Elasticsearch...
✅ Indexed 20 documents in Elasticsearch
✅ Data sent to:
🔵 Neo4j: 20 documents
🟢 Elasticsearch: 20 documents
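The detection step that decides which databases to use could look roughly like this. It is a sketch under assumptions, not the script's actual logic: `detectIntegrations` is a hypothetical name, and it assumes Neo4j is considered configured only when URL, username, and password are all set, while Elasticsearch only needs its URL.

```typescript
// Hypothetical sketch of the detection step; the real script may differ.
type Env = Record<string, string | undefined>;

interface DetectedConfig {
  neo4j?: { url: string; username: string; password: string; database: string };
  elasticsearch?: {
    url: string;
    index: string;
    username?: string;
    password?: string;
    apiKey?: string;
  };
}

function detectIntegrations(env: Env): DetectedConfig {
  const config: DetectedConfig = {};

  // Assumption: all three Neo4j variables are required to enable the integration.
  if (env.NEO4J_URL && env.NEO4J_USERNAME && env.NEO4J_PASSWORD) {
    config.neo4j = {
      url: env.NEO4J_URL,
      username: env.NEO4J_USERNAME,
      password: env.NEO4J_PASSWORD,
      database: env.NEO4J_DATABASE ?? 'neo4j',
    };
  }

  // Assumption: the URL alone enables Elasticsearch; credentials are optional.
  if (env.ELASTICSEARCH_URL) {
    config.elasticsearch = {
      url: env.ELASTICSEARCH_URL,
      index: env.ELASTICSEARCH_INDEX ?? 'classify-documents',
      username: env.ELASTICSEARCH_USERNAME,
      password: env.ELASTICSEARCH_PASSWORD,
      apiKey: env.ELASTICSEARCH_API_KEY,
    };
  }

  return config;
}

// In a real script the argument would be process.env.
const detected = detectIntegrations({
  NEO4J_URL: 'http://localhost:7474',
  NEO4J_USERNAME: 'neo4j',
  NEO4J_PASSWORD: 'password',
});
console.log(Object.keys(detected)); // lists only the integrations that are configured
```

Taking the environment as a parameter instead of reading it globally keeps the function easy to exercise with different configurations.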
// Find all documents
MATCH (d:Document) RETURN d LIMIT 10
// Find documents by domain
MATCH (d:Document {domain: "software"}) RETURN d
// Find entities mentioned in a document
MATCH (d:Document {source_file: "typescript/database.ts"})-[:MENTIONS]->(e)
RETURN d, e
// Find documents mentioning a specific entity
MATCH (d:Document)-[:MENTIONS]->(e {name: "React"})
RETURN d.title, d.source_file
# Search by text
curl -X POST "localhost:9200/classify-documents/_search" -H 'Content-Type: application/json' -d'
{
"query": {
"match": {
"summary": "database connection"
}
}
}'
# Filter by domain
curl -X POST "localhost:9200/classify-documents/_search" -H 'Content-Type: application/json' -d'
{
"query": {
"term": {
"domain": "software"
}
}
}'
# Aggregate by docType
curl -X POST "localhost:9200/classify-documents/_search" -H 'Content-Type: application/json' -d'
{
"size": 0,
"aggs": {
"by_type": {
"terms": {
"field": "docType"
}
}
}
}'
Both integrations gracefully handle failures:
- If a database connection fails during initialization, a warning is shown and the script continues
- If an insert fails, the error is logged and the classification process continues
- Connections are properly closed even if errors occur
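The behavior listed above can be sketched with a small wrapper. This is an illustration under assumptions rather than the library's implementation: `SinkClient` is a hypothetical interface that mirrors the `initialize`, `insertBatch`, and `close` methods used earlier in this document, and `sendToSink` is a hypothetical name.

```typescript
// Hypothetical minimal client shape, mirroring the methods used earlier in this document.
interface SinkClient {
  initialize(): Promise<void>;
  insertBatch(results: unknown[]): Promise<void>;
  close(): Promise<void>;
}

// Returns true when the batch was written, false when the sink was skipped or failed.
async function sendToSink(
  name: string,
  client: SinkClient,
  results: unknown[],
): Promise<boolean> {
  try {
    await client.initialize();
  } catch (err) {
    // Connection failure: warn and let the caller carry on without this sink.
    console.warn(`${name}: connection failed, skipping (${String(err)})`);
    return false;
  }

  try {
    await client.insertBatch(results);
    return true;
  } catch (err) {
    // Insert failure: log it, but do not rethrow into the classification run.
    console.error(`${name}: insert failed (${String(err)})`);
    return false;
  } finally {
    // Runs on both success and failure, so the connection is always released.
    await client.close().catch(() => undefined);
  }
}
```

A caller would invoke `sendToSink` once per database and treat a `false` result as informational, so a database outage never aborts classification.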
docker run \
--name neo4j \
-p 7474:7474 -p 7687:7687 \
-e NEO4J_AUTH=neo4j/password \
-d neo4j:latest
docker run \
--name elasticsearch \
-p 9200:9200 \
-e "discovery.type=single-node" \
-e "xpack.security.enabled=false" \
-d elasticsearch:8.11.0