- Overview
- Validations Performed
- Validation Examples with Error Messages
- Best Practices
- Debug Configuration
The Elasticsearch SQL client integrates a compile-time validation system that automatically verifies your SQL queries are compatible with the Scala case classes used for deserialization. This validation detects errors before execution, ensuring consistency between your queries and your data model.
- ✅ Early error detection: Issues are identified at compile time, not in production
- ✅ Safe refactoring: Renaming or removing a field generates a compilation error
- ✅ Living documentation: Case classes document the data structure
- ✅ Strong typing: Guarantees consistency between SQL and Scala
- ✅ Explicit error messages: Clear guidance on how to fix issues
| Validation | Description | Level |
|---|---|---|
| SELECT * Rejection | Prohibits SELECT * to ensure compile-time validation |
❌ ERROR |
| Required Fields | Verifies that all required fields are selected | ❌ ERROR |
| Unknown Fields | Detects fields that don't exist in the case class | |
| Nested Objects | Validates the structure of nested objects | ❌ ERROR |
| Nested Collections | Validates the use of UNNEST for collections | ❌ ERROR |
| Type Compatibility | Checks compatibility between SQL and Scala types | ❌ ERROR |
case class Product(
id: String,
name: String,
price: Double
)
// ❌ ERROR: SELECT * is not allowed
client.searchAs[Product]("SELECT * FROM products")❌ SELECT * is not allowed with compile-time validation.
Query: SELECT * FROM products
Reason:
• Cannot validate field existence at compile-time
• Cannot validate type compatibility at compile-time
• Schema changes will break silently at runtime
Solution:
1. Explicitly list all required fields for Product:
SELECT id, name, price FROM ...
2. Use the *Unchecked() variant for dynamic queries:
searchAsUnchecked[Product](SQLQuery("SELECT * FROM ..."))
Best Practice:
Always explicitly select only the fields you need.
// ✅ CORRECT: Explicitly select all fields
client.searchAs[Product]("SELECT id, name, price FROM products")case class User(
id: String,
name: String,
email: String
)
// ❌ ERROR: The 'email' field is missing
client.searchAs[User]("SELECT id, name FROM users")❌ SQL query does not select the required field: email
Example query:
SELECT id, name, email FROM ...
To fix this, either:
1. Add it to the SELECT clause
2. Make it Option[T] in the case class
3. Provide a default value in the case class definition
Option 1: Add the missing field
// ✅ CORRECT
client.searchAs[User]("SELECT id, name, email FROM users")Option 2: Make the field optional
case class User(
id: String,
name: String,
email: Option[String] = None // ✅ Optional field
)
// ✅ CORRECT
client.searchAs[User]("SELECT id, name FROM users")Option 3: Provide a default value
case class User(
id: String,
name: String,
email: String = "" // ✅ Default value
)
// ✅ CORRECT
client.searchAs[User]("SELECT id, name FROM users")case class Product(
id: String,
name: String,
price: Double
)
// ❌ ERROR: Typo in 'price' -> 'pric'
client.searchAs[Product]("SELECT id, name, pric FROM products")❌ SQL query does not select the required field: price
You have selected unknown field "pric", did you mean "price"?
Example query:
SELECT id, name, price FROM ...
To fix this, either:
1. Add it to the SELECT clause
2. Make it Option[T] in the case class
3. Provide a default value in the case class definition
// ✅ CORRECT: Fix the typo
client.searchAs[Product]("SELECT id, name, price FROM products")case class User(
id: String,
name: String,
email: String
)
// ⚠️ WARNING: The 'age' field doesn't exist in User
client.searchAs[User]("SELECT id, name, email, age FROM users")⚠️ SQL query selects fields that don't exist in User:
age
Available fields: id, name, email
Note: These fields will be ignored during deserialization.
- ✅ The code compiles successfully
⚠️ A warning is displayed to inform about the unknown field- 🔄 During deserialization, the unknown field is silently ignored
- 📦 The JSON response contains the field, but it's not mapped to the case class
Option 1: Remove the unknown field
// ✅ CORRECT: Only select existing fields
client.searchAs[User]("SELECT id, name, email FROM users")Option 2: Add the field to the case class
case class User(
id: String,
name: String,
email: String,
age: Option[Int] = None // ✅ Field added
)
// ✅ CORRECT
client.searchAs[User]("SELECT id, name, email, age FROM users")case class Address(
street: String,
city: String,
country: String
)
case class User(
id: String,
name: String,
address: Address
)
// ❌ ERROR: Selecting nested fields without UNNEST
client.searchAs[User](
"SELECT id, name, address.street, address.city, address.country FROM users"
)❌ Nested object field 'address' cannot be deserialized correctly.
❌ Problem:
You are selecting nested fields individually:
address.street, address.city, address.country
Elasticsearch will return flat fields like:
{ "address.street": "value1", "address.city": "value2", "address.country": "value3" }
But Jackson needs a structured object like:
{ "address": {"street": "value1", "city": "value2", "country": "value3"} }
✅ Solution 1: Select the entire nested object (recommended)
SELECT address FROM ...
✅ Solution 2: Use UNNEST (if you need to filter or join on nested fields)
SELECT address.street, address.city, address.country
FROM ...
JOIN UNNEST(....address) AS address
📚 Note: This applies to ALL nested objects, not just collections.
Option 1: Select the complete object (RECOMMENDED)
// ✅ CORRECT: Select the entire object
client.searchAs[User]("SELECT id, name, address FROM users")Elasticsearch Response:
{
"id": "u1",
"name": "Alice",
"address": {
"street": "123 Main St",
"city": "Wonderland",
"country": "Fictionland"
}
}Option 2: Use UNNEST
// ✅ CORRECT: Use UNNEST for filtering/joining
client.searchAs[User](
"""SELECT id, name, address.street, address.city, address.country
FROM users
JOIN UNNEST(users.address) AS address
WHERE address.city = 'Wonderland'"""
)case class Address(
street: String,
city: String,
country: String
)
case class User(
id: String,
name: String,
address: Address // ❌ Required field not selected
)
// ❌ ERROR: The 'address' object is not selected
client.searchAs[User]("SELECT id, name FROM users")❌ SQL query does not select the required field: address
Example query:
SELECT id, name, address FROM ...
To fix this, either:
1. Add it to the SELECT clause
2. Make it Option[T] in the case class
3. Provide a default value in the case class definition
Option 1: Add the missing field
// ✅ CORRECT
client.searchAs[User]("SELECT id, name, address FROM users")Option 2: Make the object optional
case class User(
id: String,
name: String,
address: Option[Address] = None // ✅ Optional object
)
// ✅ CORRECT
client.searchAs[User]("SELECT id, name FROM users")case class Child(
name: String,
age: Int
)
case class Parent(
id: String,
name: String,
children: List[Child]
)
// ❌ ERROR: Selecting nested fields without UNNEST
client.searchAs[Parent](
"SELECT id, name, children.name, children.age FROM parent"
)❌ Collection field 'children' cannot be deserialized correctly.
❌ Problem:
You are selecting nested fields without using UNNEST:
children.name, children.age
Elasticsearch will return flat arrays like:
{ "children.name": ["Alice", "Bob"], "children.age": [10, 12] }
But Jackson needs structured objects like:
{ "children": [{"name": "Alice", "age": 10}, {"name": "Bob", "age": 12}] }
✅ Solution 1: Select the entire collection (recommended for simple queries)
SELECT children FROM ...
✅ Solution 2: Use UNNEST for precise field selection (recommended for complex queries)
SELECT children.name, children.age
FROM ...
JOIN UNNEST(....children) AS children
📚 Documentation:
https://www.elastic.co/guide/en/elasticsearch/reference/current/nested.html
Option 1: Select the complete collection (RECOMMENDED)
// ✅ CORRECT: Select the entire collection
client.searchAs[Parent]("SELECT id, name, children FROM parent")Elasticsearch Response:
{
"id": "p1",
"name": "Parent Name",
"children": [
{"name": "Alice", "age": 10},
{"name": "Bob", "age": 12}
]
}Option 2: Use UNNEST
// ✅ CORRECT: Use UNNEST for filtering/joining
client.searchAs[Parent](
"""SELECT id, name, children.name, children.age
FROM parent
JOIN UNNEST(parent.children) AS children
WHERE children.age > 10"""
)case class Child(
name: String,
age: Int
)
case class Parent(
id: String,
name: String,
children: List[Child] // ❌ Required collection not selected
)
// ❌ ERROR: The 'children' collection is not selected
client.searchAs[Parent]("SELECT id, name FROM parent")❌ SQL query does not select the required field: children
Example query:
SELECT id, name, children FROM ...
To fix this, either:
1. Add it to the SELECT clause
2. Make it Option[T] in the case class
3. Provide a default value in the case class definition
Option 1: Add the missing collection
// ✅ CORRECT
client.searchAs[Parent]("SELECT id, name, children FROM parent")Option 2: Make the collection optional
case class Parent(
id: String,
name: String,
children: Option[List[Child]] = None // ✅ Optional collection
)
// ✅ CORRECT
client.searchAs[Parent]("SELECT id, name FROM parent")case class Product(
id: String,
name: String,
stock: Int // ❌ Wrong type (should be Long)
)
// ❌ ERROR: The 'stock' field is cast to BIGINT in SQL
client.searchAs[Product]("SELECT id, name, stock::BIGINT FROM products")Type mismatch for field 'stock': SQL type BIGINT is incompatible with Scala type Int
Expected one of: Long, BigInt, Option[Long], Option[BigInt]
case class Product(
id: String,
name: String,
stock: Long // ✅ Correct type
)
// ✅ CORRECT
client.searchAs[Product]("SELECT id, name, stock::BIGINT FROM products")| SQL Type | Compatible Scala Types |
|---|---|
TINYINT |
Byte, Short, Int, Long, Option[Byte], Option[Short], Option[Int], Option[Long] |
SMALLINT |
Short, Int, Long, Option[Short], Option[Int], Option[Long] |
INT |
Int, Long, Option[Int], Option[Long] |
BIGINT |
Long, BigInt, Option[Long], Option[BigInt] |
DOUBLE, REAL |
Double, Float, Option[Double], Option[Float] |
VARCHAR |
String, Option[String] |
CHAR |
String, Char, Option[String], Option[Char] |
BOOLEAN |
Boolean, Option[Boolean] |
TIME |
java.time.LocalTime, java.time.Instant |
DATE |
java.time.LocalDate, java.time.Instant, java.util.Date |
DATETIME, TIMESTAMP |
java.time.LocalDateTime, java.time.ZonedDateTime, java.time.Instant |
STRUCT |
Case Class |
❌ Avoid:
client.searchAs[Product]("SELECT * FROM products")✅ Prefer:
client.searchAs[Product]("SELECT id, name, price FROM products")❌ Avoid:
client.searchAs[Parent]("SELECT id, children.name FROM parent")✅ Prefer:
// Option 1: Select the complete collection
client.searchAs[Parent]("SELECT id, children FROM parent")
// Option 2: Use UNNEST for filtering
client.searchAs[Parent](
"""SELECT id, children.name
FROM parent
JOIN UNNEST(parent.children) AS children"""
)✅ Simple Fields: Can be made optional if not required
case class User(
id: String,
name: String,
email: Option[String] = None // ✅ OK for simple fields
)// ❌ BAD PRACTICE: Making nested optional to avoid error
case class User(
id: String,
name: String,
address: Option[Address] = None // ❌ Avoid if 'address' is required
)
// ✅ GOOD PRACTICE: Fix the SQL query
client.searchAs[User]("SELECT id, name, address FROM users")✅ For Simple Fields:
case class Product(
id: String,
name: String,
price: Double = 0.0, // ✅ OK
inStock: Boolean = true // ✅ OK
)❌ For Nested Objects (avoid):
case class User(
id: String,
name: String,
address: Address = Address("", "", "") // ❌ Avoid
)// In build.sbt or command line
System.setProperty("elastic.sql.debug", "true")================================================================================
🔍 Starting SQL Query Validation
================================================================================
📝 Extracted SQL: SELECT id, name, address.street FROM users
🔍 Parsed fields: id, name, address.street
🔍 Unnested collections:
📋 Required fields for User (prefix=''): id, name, address
🔍 Checking field: id (type: String, optional: false, hasDefault: false)
✅ Field 'id' is directly selected
🔍 Checking field: name (type: String, optional: false, hasDefault: false)
✅ Field 'name' is directly selected
🔍 Checking field: address (type: Address, optional: false, hasDefault: false)
🏗️ Field 'address' is a nested object (non-collection)
❌ ERROR: Nested object field 'address' cannot be deserialized correctly.
| Rule | Behavior | Level |
|---|---|---|
| SELECT * | Prohibited | ❌ ERROR |
| Required field missing | Must be added, made optional, or have a default value | ❌ ERROR |
| Unknown field | Warning (ignored during deserialization) | |
| Nested object without UNNEST | Must select complete object or use UNNEST | ❌ ERROR |
| Nested collection without UNNEST | Must select complete collection or use UNNEST | ❌ ERROR |
| Type incompatibility | Must use a compatible Scala type | ❌ ERROR |
This compile-time validation ensures the robustness and maintainability of your code! 🚀✅