Skip to content

Added a experimentCache to be used in createExperiment to check for existing experiments#1861

Open
ncau wants to merge 4 commits into
kruize:mvp_demofrom
ncau:mvp_demo
Open

Added a experimentCache to be used in createExperiment to check for existing experiments#1861
ncau wants to merge 4 commits into
kruize:mvp_demofrom
ncau:mvp_demo

Conversation

@ncau
Copy link
Copy Markdown
Contributor

@ncau ncau commented Apr 1, 2026

Original PR had to recreate due to wrong branch: #1782

Description

Added a experimentCache to be used in createExperiment to check for existing experiments.

Fixes # (issue)

Type of change

  • Bug fix
  • New feature
  • Docs update
  • Breaking change (What changes might users need to make in their application due to this PR?)
  • Requires DB changes

How has this been tested?

Please describe the tests that were run to verify your changes and steps to reproduce. Please specify any test configuration required.

  • New Test X
  • Functional testsuite

Test Configuration

  • Kubernetes clusters tested on:

Checklist 🎯

  • Followed coding guidelines
  • Comments added
  • Dependent changes merged
  • Documentation updated
  • Tests added or updated

Additional information

Include any additional information such as links, test results, screenshots here

Summary by Sourcery

Introduce an in-memory experiment cache to avoid redundant database lookups when creating and deleting experiments.

New Features:

  • Add an ExperimentCache utility to track existing experiment names using an in-memory set backed by on-demand database lookups.
  • Enforce conflict responses when attempting to create an experiment with a name that already exists.

Enhancements:

  • Reuse singleton Gson and ExperimentDBService instances within CreateExperiment to avoid repeated object creation.
  • Simplify experiment deletion by relying on the new cache instead of preloading experiments into an in-memory map.

Summary by Sourcery

Introduce an in-memory experiment cache to optimize experiment creation and enforce unique experiment names.

New Features:

  • Add an ExperimentCache utility to track and cache experiment names using database lookups on cache misses.
  • Validate experiment creation requests against the cache and return HTTP 409 when an experiment name already exists.

Enhancements:

  • Reuse a shared Gson instance in CreateExperiment instead of creating new instances for each request.
  • Cache experiment names after successful database writes and handle duplicate-name errors consistently in API responses.

Summary by Sourcery

Introduce an in-memory experiment cache to optimize experiment creation and enforce unique experiment names.

New Features:

  • Add an ExperimentCache utility to track experiment names backed by on-demand database lookups.
  • Use the experiment cache in the CreateExperiment servlet to prevent creation of experiments with duplicate names and return HTTP 409 conflicts.

Enhancements:

  • Reuse a shared Gson instance in the CreateExperiment servlet instead of creating a new instance per request.

Copilot AI review requested due to automatic review settings April 1, 2026 15:16
@sourcery-ai
Copy link
Copy Markdown
Contributor

sourcery-ai Bot commented Apr 1, 2026

Reviewer's Guide

Introduces an in-memory ExperimentCache used by the CreateExperiment servlet to pre-check and track existing experiment names, returning HTTP 409 on duplicates and reusing a shared Gson instance for JSON handling.

Sequence diagram for experiment creation with ExperimentCache

sequenceDiagram
    actor Client
    participant CreateExperiment
    participant ExperimentCache
    participant ExperimentDBService
    participant Database

    Client->>CreateExperiment: HTTP POST /createExperiment (JSON)
    CreateExperiment->>CreateExperiment: parse JSON with gson
    CreateExperiment->>CreateExperiment: ServiceHelpers.checkForBulk()

    loop For each CreateExperimentAPIObject
        CreateExperiment->>ExperimentCache: isExists(experimentName)
        alt Name found in cache or DB
            ExperimentCache->>ExperimentCache: return true
            CreateExperiment->>Client: HTTP 409 Conflict (Experiment name already exists)
            CreateExperiment-->>Client: return
        else Name not found
            ExperimentCache->>ExperimentCache: return false
        end
    end

    CreateExperiment->>CreateExperiment: normalizeAndValidateExperimentTypes()
    CreateExperiment->>CreateExperiment: find invalidKruizeObject
    alt All KruizeObjects valid
        loop For each KruizeObject
            CreateExperiment->>ExperimentDBService: addExperimentToDB(validAPIObj)
            ExperimentDBService->>Database: insert experiment
            Database-->>ExperimentDBService: insert result
            ExperimentDBService-->>CreateExperiment: ValidationOutputData

            alt addedToDB.isSuccess()
                CreateExperiment->>ExperimentCache: add(experimentName)
            else DB indicates duplicate name
                CreateExperiment->>ExperimentCache: add(experimentName)
                CreateExperiment->>Client: HTTP 409 Conflict (Experiment name already exists)
                CreateExperiment-->>Client: return
            end
        end
        CreateExperiment->>Client: HTTP 201 Created (gson.toJson(KruizeResponse))
    else Some KruizeObject invalid
        CreateExperiment->>Client: HTTP error response (validation failure)
    end
Loading

Updated class diagram for ExperimentCache and CreateExperiment

classDiagram
    class CreateExperiment {
        <<HttpServlet>>
        - static long serialVersionUID
        - static Logger LOGGER
        - static ExperimentCache experimentCache
        - static Gson gson
        + init(config ServletConfig) void
        + doPost(request HttpServletRequest, response HttpServletResponse) void
        - sendSuccessResponse(response HttpServletResponse, message String) void
        - sendErrorResponse(inputData String, response HttpServletResponse, invalidKruizeObject KruizeObject, statusCode int, message String) void
    }

    class ExperimentCache {
        - static Logger LOGGER
        - Set~String~ experimentNamesCache
        + isExists(experiment_name String) boolean
        - checkExperimentInDatabase(experiment_name String) boolean
        + add(experiment_name String) void
        + remove(experiment_name String) void
    }

    class ExperimentDBService {
        + addExperimentToDB(validAPIObj CreateExperimentAPIObject) ValidationOutputData
        + loadExperimentFromDBByName(experimentMap Map~String, KruizeObject~, experiment_name String) void
        + loadLMExperimentFromDBByName(experimentMap Map~String, KruizeObject~, experiment_name String) void
    }

    class KruizeDeploymentInfo {
        + static boolean is_ros_enabled
    }

    CreateExperiment --> ExperimentCache : uses
    CreateExperiment --> ExperimentDBService : creates_and_calls
    ExperimentCache --> ExperimentDBService : uses
    ExperimentCache --> KruizeDeploymentInfo : reads_flag
Loading

File-Level Changes

Change Details Files
Add an in-memory experiment name cache and integrate it into experiment creation flow to avoid redundant DB lookups and enforce uniqueness.
  • Introduce ExperimentCache utility backed by a concurrent set with DB-backed cache miss handling based on KruizeDeploymentInfo.is_ros_enabled.
  • Add static ExperimentCache instance in CreateExperiment and use it to pre-check incoming experiment names before validation and DB write.
  • On successful experiment creation or duplicate-name DB error, update the cache and return HTTP 409 for conflicts, logging cache hits and misses for observability.
src/main/java/com/autotune/analyzer/utils/ExperimentCache.java
src/main/java/com/autotune/analyzer/services/CreateExperiment.java
Refactor CreateExperiment servlet to reuse a shared Gson instance for JSON serialization/deserialization.
  • Declare a static Gson instance in CreateExperiment and use it for parsing the request body into CreateExperimentAPIObject list.
  • Use the shared Gson instance for serializing success responses instead of constructing new Gson objects.
src/main/java/com/autotune/analyzer/services/CreateExperiment.java

Tips and commands

Interacting with Sourcery

  • Trigger a new review: Comment @sourcery-ai review on the pull request.
  • Continue discussions: Reply directly to Sourcery's review comments.
  • Generate a GitHub issue from a review comment: Ask Sourcery to create an
    issue from a review comment by replying to it. You can also reply to a
    review comment with @sourcery-ai issue to create an issue from it.
  • Generate a pull request title: Write @sourcery-ai anywhere in the pull
    request title to generate a title at any time. You can also comment
    @sourcery-ai title on the pull request to (re-)generate the title at any time.
  • Generate a pull request summary: Write @sourcery-ai summary anywhere in
    the pull request body to generate a PR summary at any time exactly where you
    want it. You can also comment @sourcery-ai summary on the pull request to
    (re-)generate the summary at any time.
  • Generate reviewer's guide: Comment @sourcery-ai guide on the pull
    request to (re-)generate the reviewer's guide at any time.
  • Resolve all Sourcery comments: Comment @sourcery-ai resolve on the
    pull request to resolve all Sourcery comments. Useful if you've already
    addressed all the comments and don't want to see them anymore.
  • Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull
    request to dismiss all existing Sourcery reviews. Especially useful if you
    want to start fresh with a new review - don't forget to comment
    @sourcery-ai review to trigger a new review!

Customizing Your Experience

Access your dashboard to:

  • Enable or disable review features such as the Sourcery-generated pull request
    summary, the reviewer's guide, and others.
  • Change the review language.
  • Add, remove or edit custom review instructions.
  • Adjust other review settings.

Getting Help

@ncau ncau requested review from mbvreddy and removed request for Copilot April 1, 2026 15:17
Copy link
Copy Markdown
Contributor

@sourcery-ai sourcery-ai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey - I've found 3 issues, and left some high level feedback:

  • The duplicate-name detection in doPost currently relies on checking addedToDB.getMessage().contains("already exists"), which is brittle; consider exposing a structured status/enum or specific exception from ExperimentDBService so you can branch on a stable signal instead of parsing messages.
  • There are still multiple new ExperimentDBService() calls (per request and from ExperimentCache); if this service is stateless, consider reusing a shared instance (similar to Gson) to avoid repeated object creation and simplify dependency management, potentially via injection.
  • The current cache-based pre-check in ExperimentCache.isExists does not prevent race conditions where two concurrent requests create the same experiment name (both can miss the cache and DB); if uniqueness guarantees are important, keep relying on DB constraints and treat the cache as an optimization only, or add a per-name synchronization strategy.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- The duplicate-name detection in `doPost` currently relies on checking `addedToDB.getMessage().contains("already exists")`, which is brittle; consider exposing a structured status/enum or specific exception from `ExperimentDBService` so you can branch on a stable signal instead of parsing messages.
- There are still multiple `new ExperimentDBService()` calls (per request and from `ExperimentCache`); if this service is stateless, consider reusing a shared instance (similar to `Gson`) to avoid repeated object creation and simplify dependency management, potentially via injection.
- The current cache-based pre-check in `ExperimentCache.isExists` does not prevent race conditions where two concurrent requests create the same experiment name (both can miss the cache and DB); if uniqueness guarantees are important, keep relying on DB constraints and treat the cache as an optimization only, or add a per-name synchronization strategy.

## Individual Comments

### Comment 1
<location path="src/main/java/com/autotune/analyzer/services/CreateExperiment.java" line_range="111-113" />
<code_context>
+            // Check if experiment already exists before processing
+            for (CreateExperimentAPIObject createExperimentAPIObject : createExperimentAPIObjects) {
+                String experimentName = createExperimentAPIObject.getExperimentName();
+                if (experimentCache.isExists(experimentName)) {
+                    LOGGER.debug("Experiment {} already exists, returning 409", experimentName);
+                    sendErrorResponse(inputData, response, null, HttpServletResponse.SC_CONFLICT, "Experiment name already exists");
+                    return;
+                }
</code_context>
<issue_to_address>
**issue (bug_risk):** The pre-flight uniqueness check can race with concurrent requests and may produce inconsistent behavior for bulk creates.

`experimentCache.isExists()` as a pre-insert gate can race: two concurrent requests can both see “not exists” and then both attempt inserts, leaving the DB constraint as the real arbiter (which you already handle via the `"already exists"` message). For bulk creates, this also aborts the entire batch if any one experiment exists. Consider relying on DB insert failures as the canonical existence check (using the cache only as a read-through optimization) and clarifying whether partial success for bulk requests is preferable to rejecting the whole batch on the first conflict.
</issue_to_address>

### Comment 2
<location path="src/main/java/com/autotune/analyzer/services/CreateExperiment.java" line_range="143-145" />
<code_context>
+                    
+                    if (addedToDB.isSuccess()) {
+                        experimentCache.add(ko.getExperimentName());
+                    } else {
+                        // Check if experiment already exists
+                        if (addedToDB.getMessage() != null && addedToDB.getMessage().contains("already exists")) {
+                            LOGGER.debug("Experiment {} already exists, returning 409", ko.getExperimentName());
+                            experimentCache.add(ko.getExperimentName());
</code_context>
<issue_to_address>
**issue (bug_risk):** Relying on a substring match in the DB error message for conflict detection is brittle.

This `addedToDB.getMessage().contains("already exists")` check couples behavior to a specific error string, which may change and break conflict handling without warning. Prefer a structured indicator from `ValidationOutputData` (e.g., error code/enum/flag for uniqueness violation). If that’s not available yet, at least centralize this string constant and document the contract with the DB layer.
</issue_to_address>

### Comment 3
<location path="src/main/java/com/autotune/analyzer/utils/ExperimentCache.java" line_range="84" />
<code_context>
+     * @return true if experiment exists in database, false otherwise
+     */
+    private boolean checkExperimentInDatabase(String experiment_name) {
+        Map<String, KruizeObject> experimentMap = new ConcurrentHashMap<>();
+        try {
+            ExperimentDBService experimentDBService = new ExperimentDBService();
</code_context>
<issue_to_address>
**suggestion:** Using ConcurrentHashMap for a local, single-threaded map introduces unnecessary overhead.

Since `experimentMap` is only used within `checkExperimentInDatabase` and never shared across threads, you can use a plain `HashMap` here for simpler, slightly more efficient code with equivalent behavior.

Suggested implementation:

```java
    private boolean checkExperimentInDatabase(String experiment_name) {
        Map<String, KruizeObject> experimentMap = new HashMap<>();

```

1. Ensure `java.util.HashMap` is imported at the top of the file:
   `import java.util.HashMap;`
2. If `java.util.concurrent.ConcurrentHashMap` is no longer used anywhere in this file, remove its import to avoid an unused import warning.
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

Comment thread src/main/java/com/autotune/analyzer/services/CreateExperiment.java
Comment thread src/main/java/com/autotune/analyzer/services/CreateExperiment.java
Comment thread src/main/java/com/autotune/analyzer/utils/ExperimentCache.java Outdated
private boolean checkExperimentInDatabase(String experiment_name) {
Map<String, KruizeObject> experimentMap = new ConcurrentHashMap<>();
try {
ExperimentDBService experimentDBService = new ExperimentDBService();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change this as instance variable so that it will be initialized once.

Copilot AI review requested due to automatic review settings April 6, 2026 18:42
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Introduces an in-memory ExperimentCache and wires it into the CreateExperiment API to reduce database lookups and reject duplicate experiment names (HTTP 409).

Changes:

  • Added ExperimentCache utility backed by a concurrent set with DB fallback on cache miss.
  • Updated CreateExperiment to reuse a static Gson instance and to pre-check experiment-name uniqueness via the cache.
  • Added conflict (409) handling paths when a duplicate experiment name is detected.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 6 comments.

File Description
src/main/java/com/autotune/analyzer/utils/ExperimentCache.java Adds experiment-name cache with DB-backed existence checks.
src/main/java/com/autotune/analyzer/services/CreateExperiment.java Uses cache to detect duplicates before creation; reuses shared Gson for JSON serialization/deserialization.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/main/java/com/autotune/analyzer/utils/ExperimentCache.java
Comment thread src/main/java/com/autotune/analyzer/services/CreateExperiment.java
Comment thread src/main/java/com/autotune/analyzer/services/CreateExperiment.java
Comment thread src/main/java/com/autotune/analyzer/services/CreateExperiment.java
Comment thread src/main/java/com/autotune/analyzer/services/CreateExperiment.java
Comment thread src/main/java/com/autotune/analyzer/utils/ExperimentCache.java
Comment thread src/main/java/com/autotune/analyzer/utils/ExperimentCache.java Outdated
Comment thread src/main/java/com/autotune/analyzer/utils/ExperimentCache.java Outdated
Comment thread src/main/java/com/autotune/analyzer/utils/ExperimentCache.java
@ncau ncau requested a review from mbvreddy April 7, 2026 14:16
@chandrams chandrams moved this from In Progress to Under Review in Monitoring Apr 9, 2026
@kusumachalasani kusumachalasani requested a review from dinogun April 9, 2026 11:36
@dinogun
Copy link
Copy Markdown
Contributor

dinogun commented Apr 10, 2026

@ncau how big is the cache? Have we tested this to see how much time is reduced per call vs the overhead. Would be good to have some numbers to understand the value of the cache here

@dinogun dinogun removed this from the Kruize 0.10.0 Release milestone Apr 10, 2026
@github-actions
Copy link
Copy Markdown

This pull request has been marked as stale due to 30 days of inactivity. Please comment if this pull request is still relevant to keep it active.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

Status: Under Review

Development

Successfully merging this pull request may close these issues.

6 participants