Added a experimentCache to be used in createExperiment to check for existing experiments#1861
Added a experimentCache to be used in createExperiment to check for existing experiments#1861ncau wants to merge 4 commits into
Conversation
Reviewer's GuideIntroduces an in-memory ExperimentCache used by the CreateExperiment servlet to pre-check and track existing experiment names, returning HTTP 409 on duplicates and reusing a shared Gson instance for JSON handling. Sequence diagram for experiment creation with ExperimentCachesequenceDiagram
actor Client
participant CreateExperiment
participant ExperimentCache
participant ExperimentDBService
participant Database
Client->>CreateExperiment: HTTP POST /createExperiment (JSON)
CreateExperiment->>CreateExperiment: parse JSON with gson
CreateExperiment->>CreateExperiment: ServiceHelpers.checkForBulk()
loop For each CreateExperimentAPIObject
CreateExperiment->>ExperimentCache: isExists(experimentName)
alt Name found in cache or DB
ExperimentCache->>ExperimentCache: return true
CreateExperiment->>Client: HTTP 409 Conflict (Experiment name already exists)
CreateExperiment-->>Client: return
else Name not found
ExperimentCache->>ExperimentCache: return false
end
end
CreateExperiment->>CreateExperiment: normalizeAndValidateExperimentTypes()
CreateExperiment->>CreateExperiment: find invalidKruizeObject
alt All KruizeObjects valid
loop For each KruizeObject
CreateExperiment->>ExperimentDBService: addExperimentToDB(validAPIObj)
ExperimentDBService->>Database: insert experiment
Database-->>ExperimentDBService: insert result
ExperimentDBService-->>CreateExperiment: ValidationOutputData
alt addedToDB.isSuccess()
CreateExperiment->>ExperimentCache: add(experimentName)
else DB indicates duplicate name
CreateExperiment->>ExperimentCache: add(experimentName)
CreateExperiment->>Client: HTTP 409 Conflict (Experiment name already exists)
CreateExperiment-->>Client: return
end
end
CreateExperiment->>Client: HTTP 201 Created (gson.toJson(KruizeResponse))
else Some KruizeObject invalid
CreateExperiment->>Client: HTTP error response (validation failure)
end
Updated class diagram for ExperimentCache and CreateExperimentclassDiagram
class CreateExperiment {
<<HttpServlet>>
- static long serialVersionUID
- static Logger LOGGER
- static ExperimentCache experimentCache
- static Gson gson
+ init(config ServletConfig) void
+ doPost(request HttpServletRequest, response HttpServletResponse) void
- sendSuccessResponse(response HttpServletResponse, message String) void
- sendErrorResponse(inputData String, response HttpServletResponse, invalidKruizeObject KruizeObject, statusCode int, message String) void
}
class ExperimentCache {
- static Logger LOGGER
- Set~String~ experimentNamesCache
+ isExists(experiment_name String) boolean
- checkExperimentInDatabase(experiment_name String) boolean
+ add(experiment_name String) void
+ remove(experiment_name String) void
}
class ExperimentDBService {
+ addExperimentToDB(validAPIObj CreateExperimentAPIObject) ValidationOutputData
+ loadExperimentFromDBByName(experimentMap Map~String, KruizeObject~, experiment_name String) void
+ loadLMExperimentFromDBByName(experimentMap Map~String, KruizeObject~, experiment_name String) void
}
class KruizeDeploymentInfo {
+ static boolean is_ros_enabled
}
CreateExperiment --> ExperimentCache : uses
CreateExperiment --> ExperimentDBService : creates_and_calls
ExperimentCache --> ExperimentDBService : uses
ExperimentCache --> KruizeDeploymentInfo : reads_flag
File-Level Changes
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
There was a problem hiding this comment.
Hey - I've found 3 issues, and left some high level feedback:
- The duplicate-name detection in
doPostcurrently relies on checkingaddedToDB.getMessage().contains("already exists"), which is brittle; consider exposing a structured status/enum or specific exception fromExperimentDBServiceso you can branch on a stable signal instead of parsing messages. - There are still multiple
new ExperimentDBService()calls (per request and fromExperimentCache); if this service is stateless, consider reusing a shared instance (similar toGson) to avoid repeated object creation and simplify dependency management, potentially via injection. - The current cache-based pre-check in
ExperimentCache.isExistsdoes not prevent race conditions where two concurrent requests create the same experiment name (both can miss the cache and DB); if uniqueness guarantees are important, keep relying on DB constraints and treat the cache as an optimization only, or add a per-name synchronization strategy.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- The duplicate-name detection in `doPost` currently relies on checking `addedToDB.getMessage().contains("already exists")`, which is brittle; consider exposing a structured status/enum or specific exception from `ExperimentDBService` so you can branch on a stable signal instead of parsing messages.
- There are still multiple `new ExperimentDBService()` calls (per request and from `ExperimentCache`); if this service is stateless, consider reusing a shared instance (similar to `Gson`) to avoid repeated object creation and simplify dependency management, potentially via injection.
- The current cache-based pre-check in `ExperimentCache.isExists` does not prevent race conditions where two concurrent requests create the same experiment name (both can miss the cache and DB); if uniqueness guarantees are important, keep relying on DB constraints and treat the cache as an optimization only, or add a per-name synchronization strategy.
## Individual Comments
### Comment 1
<location path="src/main/java/com/autotune/analyzer/services/CreateExperiment.java" line_range="111-113" />
<code_context>
+ // Check if experiment already exists before processing
+ for (CreateExperimentAPIObject createExperimentAPIObject : createExperimentAPIObjects) {
+ String experimentName = createExperimentAPIObject.getExperimentName();
+ if (experimentCache.isExists(experimentName)) {
+ LOGGER.debug("Experiment {} already exists, returning 409", experimentName);
+ sendErrorResponse(inputData, response, null, HttpServletResponse.SC_CONFLICT, "Experiment name already exists");
+ return;
+ }
</code_context>
<issue_to_address>
**issue (bug_risk):** The pre-flight uniqueness check can race with concurrent requests and may produce inconsistent behavior for bulk creates.
`experimentCache.isExists()` as a pre-insert gate can race: two concurrent requests can both see “not exists” and then both attempt inserts, leaving the DB constraint as the real arbiter (which you already handle via the `"already exists"` message). For bulk creates, this also aborts the entire batch if any one experiment exists. Consider relying on DB insert failures as the canonical existence check (using the cache only as a read-through optimization) and clarifying whether partial success for bulk requests is preferable to rejecting the whole batch on the first conflict.
</issue_to_address>
### Comment 2
<location path="src/main/java/com/autotune/analyzer/services/CreateExperiment.java" line_range="143-145" />
<code_context>
+
+ if (addedToDB.isSuccess()) {
+ experimentCache.add(ko.getExperimentName());
+ } else {
+ // Check if experiment already exists
+ if (addedToDB.getMessage() != null && addedToDB.getMessage().contains("already exists")) {
+ LOGGER.debug("Experiment {} already exists, returning 409", ko.getExperimentName());
+ experimentCache.add(ko.getExperimentName());
</code_context>
<issue_to_address>
**issue (bug_risk):** Relying on a substring match in the DB error message for conflict detection is brittle.
This `addedToDB.getMessage().contains("already exists")` check couples behavior to a specific error string, which may change and break conflict handling without warning. Prefer a structured indicator from `ValidationOutputData` (e.g., error code/enum/flag for uniqueness violation). If that’s not available yet, at least centralize this string constant and document the contract with the DB layer.
</issue_to_address>
### Comment 3
<location path="src/main/java/com/autotune/analyzer/utils/ExperimentCache.java" line_range="84" />
<code_context>
+ * @return true if experiment exists in database, false otherwise
+ */
+ private boolean checkExperimentInDatabase(String experiment_name) {
+ Map<String, KruizeObject> experimentMap = new ConcurrentHashMap<>();
+ try {
+ ExperimentDBService experimentDBService = new ExperimentDBService();
</code_context>
<issue_to_address>
**suggestion:** Using ConcurrentHashMap for a local, single-threaded map introduces unnecessary overhead.
Since `experimentMap` is only used within `checkExperimentInDatabase` and never shared across threads, you can use a plain `HashMap` here for simpler, slightly more efficient code with equivalent behavior.
Suggested implementation:
```java
private boolean checkExperimentInDatabase(String experiment_name) {
Map<String, KruizeObject> experimentMap = new HashMap<>();
```
1. Ensure `java.util.HashMap` is imported at the top of the file:
`import java.util.HashMap;`
2. If `java.util.concurrent.ConcurrentHashMap` is no longer used anywhere in this file, remove its import to avoid an unused import warning.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
| private boolean checkExperimentInDatabase(String experiment_name) { | ||
| Map<String, KruizeObject> experimentMap = new ConcurrentHashMap<>(); | ||
| try { | ||
| ExperimentDBService experimentDBService = new ExperimentDBService(); |
There was a problem hiding this comment.
Change this as instance variable so that it will be initialized once.
There was a problem hiding this comment.
Pull request overview
Introduces an in-memory ExperimentCache and wires it into the CreateExperiment API to reduce database lookups and reject duplicate experiment names (HTTP 409).
Changes:
- Added
ExperimentCacheutility backed by a concurrent set with DB fallback on cache miss. - Updated
CreateExperimentto reuse a staticGsoninstance and to pre-check experiment-name uniqueness via the cache. - Added conflict (409) handling paths when a duplicate experiment name is detected.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
src/main/java/com/autotune/analyzer/utils/ExperimentCache.java |
Adds experiment-name cache with DB-backed existence checks. |
src/main/java/com/autotune/analyzer/services/CreateExperiment.java |
Uses cache to detect duplicates before creation; reuses shared Gson for JSON serialization/deserialization. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
@ncau how big is the cache? Have we tested this to see how much time is reduced per call vs the overhead. Would be good to have some numbers to understand the value of the cache here |
|
This pull request has been marked as stale due to 30 days of inactivity. Please comment if this pull request is still relevant to keep it active. |
Original PR had to recreate due to wrong branch: #1782
Description
Added a experimentCache to be used in createExperiment to check for existing experiments.
Fixes # (issue)
Type of change
How has this been tested?
Please describe the tests that were run to verify your changes and steps to reproduce. Please specify any test configuration required.
Test Configuration
Checklist 🎯
Additional information
Include any additional information such as links, test results, screenshots here
Summary by Sourcery
Introduce an in-memory experiment cache to avoid redundant database lookups when creating and deleting experiments.
New Features:
Enhancements:
Summary by Sourcery
Introduce an in-memory experiment cache to optimize experiment creation and enforce unique experiment names.
New Features:
Enhancements:
Summary by Sourcery
Introduce an in-memory experiment cache to optimize experiment creation and enforce unique experiment names.
New Features:
Enhancements: