Make analytics-engine an optional dependency by bowenlan-amzn · Pull Request #5403 · opensearch-project/sql

bowenlan-amzn · 2026-05-04T04:47:02Z

Summary

Makes the analytics-engine plugin an optional dependency of opensearch-sql. Installing opensearch-sql on a distribution without analytics-engine now succeeds; all PPL/SQL queries route through the in-plugin v2 Calcite engine (Lucene-backed). When analytics-engine is installed, the existing Mustang path (parquet-backed indices → QueryPlanExecutor) continues to work.

Approach

extendedPlugins = ['opensearch-job-scheduler', 'analytics-engine;optional=true']
Move QueryPlanExecutor from a constructor parameter to an @Inject(optional = true) setter. Guice invokes the setter only when the binding exists; absent analytics-engine it's silently skipped and unifiedQueryHandler stays null. PPL routing then falls through to the v2 Calcite-to-OpenSearch path unconditionally.
Promote analytics-framework.jar from compileOnly to api so QueryPlanExecutor/SchemaProvider resolve from sql's own classpath when analytics-engine is absent. analytics-engine.jar stays compileOnly — its classes (e.g. OpenSearchSchemaBuilder) are reached only through RestUnifiedQueryAction, which never loads when the setter is skipped.
Drop the bundlePlugin { exclude '*.jar' } block. OpenSearch's jar-hell check skips the extended-plugin cross-check for optional deps (PluginsService.java:763), so sql can bundle every transitive jar it needs to run standalone. When analytics-engine is installed, parent-first classloader delegation still causes analytics-engine's copies to win for any shared class; sql's bundled copies sit idle. No class-identity drift.

Addressing the diff analyzer findings

All three flagged items are intentional and follow documented OpenSearch plugin mechanisms:

;optional=true syntax — parsed in PluginInfo.java:229 ("optional=true".equals(dependency[1])). Used in production elsewhere in the OpenSearch plugin ecosystem. Not ad-hoc syntax.
api promotion for analytics-framework — required so that sql's own classloader can resolve QueryPlanExecutor/SchemaProvider when analytics-engine is absent. When it IS present, the class-identity question is resolved by parent-first delegation: analytics-engine supplies the parent classloader (via extendedPlugins), and its copy of every framework class wins. sql's bundled jar is dead code in that scenario.
Removed exclude block — the excluded jars existed only to avoid jar-hell errors when sql and analytics-engine both shipped the same transitive (guava, calcite, jackson, etc.). With ;optional=true, PluginsService.checkBundleJarHell skips that check (PluginsService.java:763). sql bundling them is now safe. Security-patched platform copies are not shadowed — sql ships its own copies only; nothing is overwritten on disk or in the classpath search path.

Validation

Tested on a live 2-node cluster in both configurations:

With analytics-engine installed:

Legacy PPL (source=regular_test) → returns rows via Lucene path ✓
Analytics PPL (source=parquet_test, 1 shard, parquet-backed) → returns 20 rows via Mustang path ✓
SQL queries unaffected ✓

Without analytics-engine (only opensearch-job-scheduler + opensearch-sql):

Plugin install succeeds (no "Missing plugin" error) ✓
Cluster starts cleanly to green ✓
Legacy PPL against Lucene index → returns rows ✓
SQL against Lucene index → returns rows ✓
parquet_-prefixed lookup → clean IndexNotFoundException, not NPE or NoClassDefFoundError ✓

Not in this PR

The routing heuristic (table name prefix parquet_) is unchanged. A proper setting-based or data-format-based trigger is a separate discussion.
No changes on the analytics-engine side or in core OpenSearch. The mechanism uses only existing platform features.

github-actions · 2026-05-04T04:48:08Z

PR Code Analyzer ❗

AI-powered 'Code-Diff-Analyzer' found issues on commit e271ff6.

Path	Line	Severity	Description
plugin/build.gradle	164	high	Dependency scope changed from 'compileOnly' to 'api' for analytics-framework-3.7.0-SNAPSHOT.jar. This promotes a previously compile-only local artifact to a transitive API dependency, altering the plugin's dependency graph and class loading at runtime. Mandatory flag: dependency configuration change.
plugin/build.gradle	58	high	extendedPlugins declaration changed from 'analytics-engine' to 'analytics-engine;optional=true'. This modifies how the plugin loader resolves and links the analytics-engine plugin at startup, potentially altering the shared classloader chain. Mandatory flag: build plugin/dependency configuration change.
plugin/build.gradle	60	medium	The entire bundlePlugin exclusion block (20+ jar exclusions preventing jar hell with analytics-engine's classloader) has been removed. Without these exclusions, the SQL plugin bundle may now include duplicate classes previously provided by analytics-engine, risking ClassLoader conflicts or silently shadowing classes at runtime. This warrants verification that the optional-dependency refactoring fully accounts for the prior deduplication logic.

The table above displays the top 10 most important findings.

Total: 3 | Critical: 0 | High: 2 | Medium: 1 | Low: 0

Pull Requests Author(s): Please update your Pull Request according to the report above.

Repository Maintainer(s): You can bypass diff analyzer by adding label skip-diff-analyzer after reviewing the changes carefully, then re-run failed actions. To re-enable the analyzer, remove the label, then re-run all actions.

⚠️ Note: The Code-Diff-Analyzer helps protect against potentially harmful code patterns. Please ensure you have thoroughly reviewed the changes beforehand.

Thanks.

@Inject

sql plugin previously declared analytics-engine as a hard dependency: extendedPlugins = ['opensearch-job-scheduler', 'analytics-engine'] Installing opensearch-sql on a distribution that doesn't ship analytics-engine failed with: Missing plugin [analytics-engine], dependency of [opensearch-sql] Marking the dep ;optional=true alone is not enough — TransportPPLQueryAction Guice-injects QueryPlanExecutor on its constructor, and Guice's OpenSearch fork rejects a required constructor parameter whose binding is missing at injector-build time ("constructors cannot be optional"). Move QueryPlanExecutor from a required constructor parameter to an @Inject(optional=true) setter. Guice invokes the setter only when a binding for QueryPlanExecutor<RelNode, Iterable<Object[]>> exists — i.e. when analytics-engine's createGuiceModules has run and bound DefaultPlanExecutor. Absent analytics-engine, the setter is silently skipped, unifiedQueryHandler stays null, and all PPL queries route to the v2 Calcite-to-OpenSearch path already in the sql plugin. Drop the bundlePlugin exclude list. OpenSearch's jar-hell check skips the extended-plugin cross-check when the dep is marked optional (PluginsService.java:763), so sql can bundle every jar it needs to run self-sufficiently. When analytics-engine is installed, parent-first classloader delegation still lets analytics-engine's copies win for any shared class; sql's bundled copies sit idle. Promote analytics-framework.jar from compileOnly to api so QueryPlanExecutor is reachable from sql's own classloader when the plugin is absent. analytics-engine.jar stays compileOnly (required only for OpenSearchSchemaBuilder, which lives in the engine plugin and is reached only through RestUnifiedQueryAction — a class that never loads when the setter is skipped). Validated on a live 2-node cluster in both configurations: - With analytics-engine installed: legacy and analytics PPL both return expected rows; routing to the analytics path still fires for parquet_-prefixed indices. - Without analytics-engine (only opensearch-job-scheduler + opensearch-sql installed): cluster starts cleanly, PPL and SQL queries against Lucene indices return expected rows, parquet_-prefixed lookups return a clean IndexNotFoundException instead of a NullPointerException or NoClassDefFoundError. Signed-off-by: bowenlan-amzn <bowenlan23@gmail.com>

mch2 · 2026-05-04T17:02:10Z

+   * Absent analytics-engine, Guice silently skips the call; {@link #unifiedQueryHandler}
+   * stays null and all PPL routing falls through to the legacy engine.
+   */
+  @Inject(optional = true)


does this work?

bowenlan-amzn marked this pull request as ready for review May 4, 2026 05:23

bowenlan-amzn requested review from LantaoJin, RyanL1997, Swiddis, acarbonetto, ahkcs, anirudha, dai-chen, joshuali925, mengweieric, noCharger, penghuo, ps48, qianheng-aws, songkant-aws, vamsimanohar, ykmr1224 and yuancu as code owners May 4, 2026 05:23

mch2 reviewed May 4, 2026

View reviewed changes

bowenlan-amzn force-pushed the analytics-optional branch from 280d3b4 to e271ff6 Compare May 4, 2026 17:42

bowenlan-amzn added the enhancement New feature or request label May 4, 2026

ahkcs approved these changes May 4, 2026

View reviewed changes

ahkcs mentioned this pull request May 6, 2026

Exclude jsr305 from SQL bundle to fix jar hell with arrow-flight-rpc #5409

Merged

3 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Make analytics-engine an optional dependency #5403

Make analytics-engine an optional dependency #5403
bowenlan-amzn wants to merge 1 commit intoopensearch-project:feature/mustang-ppl-integrationfrom
bowenlan-amzn:analytics-optional

bowenlan-amzn commented May 4, 2026 •

edited

Loading

Uh oh!

github-actions Bot commented May 4, 2026 •

edited

Loading

Uh oh!

mch2 May 4, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

bowenlan-amzn commented May 4, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Approach

Addressing the diff analyzer findings

Validation

Not in this PR

Uh oh!

github-actions Bot commented May 4, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR Code Analyzer ❗

Uh oh!

mch2 May 4, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

bowenlan-amzn commented May 4, 2026 •

edited

Loading

github-actions Bot commented May 4, 2026 •

edited

Loading