Bulk api label filtering for experiment type container only#1918
Bulk api label filtering for experiment type container only#1918pinkygupta-hub wants to merge 13 commits into
Conversation
Signed-off-by: Pinky Gupta <pinkygupta@Pinkys-MacBook-Pro.local>
Reviewer's GuideImplements label-based namespace and pod filtering for bulk metadata retrieval and experiment creation, wiring Bulk API label filters through to Prometheus queries and post-processing so that only matching resources (for experiment type container) produce experiments. Sequence diagram for Bulk API label-based metadata filteringsequenceDiagram
participant BulkJobManager
participant DataSourceManager
participant DataSourceMetadataOperator
participant DataSourceMetadataHelper
participant Prometheus
BulkJobManager->>BulkJobManager: buildResourceFilters(filter.getInclude())
BulkJobManager->>BulkJobManager: buildResourceFilters(filter.getExclude())
BulkJobManager->>DataSourceManager: importMetadataFromDataSource(metadataProfileName, datasource, null, startTime, endTime, steps, measurementDuration, includeResourcesMap, excludeResourcesMap)
DataSourceManager->>DataSourceMetadataOperator: processQueriesAndPopulateDataSourceMetadataInfo(metadataProfileName, dataSourceInfo, includeResources, excludeResources, startTime, endTime, steps, measurementDuration)
DataSourceMetadataOperator->>DataSourceMetadataOperator: getNamespacesMatchingLabelFilter(dataSourceInfo, metadataProfile, includeResources, startTime, endTime, steps, measurementDuration)
DataSourceMetadataOperator->>Prometheus: fetchQueryResults(dataSourceInfo, namespacesWithLabelFilterQuery, startTime, endTime, steps)
Prometheus-->>DataSourceMetadataOperator: JsonArray(namespaceQueryResultArray)
DataSourceMetadataOperator->>DataSourceMetadataOperator: getWorkloadsMatchingPodLabelFilter(dataSourceInfo, metadataProfile, includeResources, startTime, endTime, steps, measurementDuration)
DataSourceMetadataOperator->>Prometheus: fetchQueryResults(dataSourceInfo, workloadsWithPodLabelFilterQuery, startTime, endTime, steps)
Prometheus-->>DataSourceMetadataOperator: JsonArray(workloadQueryResultArray)
DataSourceMetadataOperator->>Prometheus: fetchQueryResults(dataSourceInfo, namespaceQuery, startTime, endTime, steps)
Prometheus-->>DataSourceMetadataOperator: JsonArray(namespacesDataResultArray)
DataSourceMetadataOperator->>DataSourceMetadataHelper: getActiveNamespaces(namespacesDataResultArray)
DataSourceMetadataHelper-->>DataSourceMetadataOperator: HashMap(namespaces)
DataSourceMetadataOperator->>Prometheus: fetchQueryResults(dataSourceInfo, workloadQuery, startTime, endTime, steps)
Prometheus-->>DataSourceMetadataOperator: JsonArray(workloadDataResultArray)
DataSourceMetadataOperator->>DataSourceMetadataHelper: getWorkloadInfo(workloadDataResultArray)
DataSourceMetadataHelper-->>DataSourceMetadataOperator: HashMap(namespaceWorkloadMap)
DataSourceMetadataOperator->>Prometheus: fetchQueryResults(dataSourceInfo, containerQuery, startTime, endTime, steps)
Prometheus-->>DataSourceMetadataOperator: JsonArray(containerDataResultArray)
DataSourceMetadataOperator->>DataSourceMetadataHelper: getContainerInfo(containerDataResultArray)
DataSourceMetadataHelper-->>DataSourceMetadataOperator: HashMap(workloadContainerMap)
DataSourceMetadataOperator->>DataSourceMetadataHelper: updateContainerDataSourceMetadataInfoObject(dataSourceName, dataSourceMetadataInfo, namespaceWorkloadMap, workloadContainerMap)
DataSourceMetadataOperator->>DataSourceMetadataHelper: filterMetadataInfoObject(dataSourceName, dataSourceMetadataInfo, matchedNamespaces, matchedWorkloads)
DataSourceMetadataHelper-->>DataSourceMetadataOperator: filtered DataSourceMetadataInfo
DataSourceMetadataOperator-->>DataSourceManager: DataSourceMetadataInfo
DataSourceManager-->>BulkJobManager: DataSourceMetadataInfo
File-Level Changes
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
There was a problem hiding this comment.
Hey - I've found 3 issues, and left some high level feedback:
- In
filterMetadataInfoObject, all errors are swallowed behind a generic log and the method becomes a no-op on any exception; consider tightening the try/catch scope and surfacing failures (or at least logging enough context) so that unexpected filtering issues are visible and debuggable. getMatchedNamespacesandgetMatchedWorkloadsboth construct a newDataSourceMetadataHelperinstance instead of reusing the existing helper and share very similar patterns (build query → fetch → map); you can reduce object churn and duplication by reusing the helper and extracting the common query/fetch/mapping logic.- The label key normalization and filter construction in
buildLabelFilters(key.replaceAll("[^a-zA-Z0-9_]", "_")andlabel_prefix) hard-code Prometheus metric label naming assumptions; it would be safer and easier to maintain if this transformation logic were centralized (and shared with any other kube_pod_labels users) rather than embedded here as string manipulation.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- In `filterMetadataInfoObject`, all errors are swallowed behind a generic log and the method becomes a no-op on any exception; consider tightening the try/catch scope and surfacing failures (or at least logging enough context) so that unexpected filtering issues are visible and debuggable.
- `getMatchedNamespaces` and `getMatchedWorkloads` both construct a new `DataSourceMetadataHelper` instance instead of reusing the existing helper and share very similar patterns (build query → fetch → map); you can reduce object churn and duplication by reusing the helper and extracting the common query/fetch/mapping logic.
- The label key normalization and filter construction in `buildLabelFilters` (`key.replaceAll("[^a-zA-Z0-9_]", "_")` and `label_` prefix) hard-code Prometheus metric label naming assumptions; it would be safer and easier to maintain if this transformation logic were centralized (and shared with any other kube_pod_labels users) rather than embedded here as string manipulation.
## Individual Comments
### Comment 1
<location path="src/main/java/com/autotune/analyzer/workerimpl/BulkJobManager.java" line_range="596" />
<code_context>
+
+ labels.forEach((key, value) -> {
+ String normalizedKey = key.replaceAll("[^a-zA-Z0-9_]", "_");
+ String matcher = "label_" + normalizedKey + "=\"" + value + "\"";
+ podLabelBuilder.append(matcher).append(",");
+ namespaceLabelBuilder.append(matcher).append(",");
</code_context>
<issue_to_address>
**🚨 issue (security):** User-provided label values should be escaped/sanitized before embedding into PromQL matchers.
`labels` values appear to come from user input and are inserted directly into a PromQL selector. Unescaped quotes, backslashes, or PromQL metacharacters can break the query or allow query injection. Please escape `value` correctly (e.g., handle `"` and `\`) or use a dedicated helper to safely construct PromQL label matchers.
</issue_to_address>
### Comment 2
<location path="src/main/java/com/autotune/common/datasource/DataSourceMetadataOperator.java" line_range="333" />
<code_context>
+
+ LOGGER.info("namespaceLabelFilterQuery: {}", namespaceQuery);
+ JsonArray namespaceQueryResultArray = fetchQueryResults(dataSourceInfo, namespaceQuery, startTime, endTime, steps);
+ return new DataSourceMetadataHelper().getActiveNamespaces(namespaceQueryResultArray);
+ }
+
</code_context>
<issue_to_address>
**suggestion (performance):** Avoid instantiating `DataSourceMetadataHelper` for each query and reuse the existing helper where possible.
Here and in `getMatchedWorkloads`, a new `DataSourceMetadataHelper` is created instead of using the existing `dataSourceDetailsHelper` instance. Even if the helper is stateless, this adds unnecessary allocations and diverges from the class’s existing pattern. Please reuse the existing (or a shared) instance for consistency and to avoid extra overhead.
</issue_to_address>
### Comment 3
<location path="docs/bulk-api-label-filtering-simple-guide.md" line_range="39" />
<code_context>
+ image: myapp:v1.0
+```
+
+**Key Point**: Labels are on the **POD**, not on namespace or workload!
+
+---
</code_context>
<issue_to_address>
**nitpick (typo):** Consider pluralizing 'namespace or workload' for grammatical clarity.
Since this refers to the concepts in general, consider: "Labels are on the POD, not on namespaces or workloads!"
```suggestion
**Key Point**: Labels are on the **POD**, not on namespaces or workloads!
```
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
|
|
||
| labels.forEach((key, value) -> { | ||
| String normalizedKey = key.replaceAll("[^a-zA-Z0-9_]", "_"); | ||
| String matcher = "label_" + normalizedKey + "=\"" + value + "\""; |
There was a problem hiding this comment.
🚨 issue (security): User-provided label values should be escaped/sanitized before embedding into PromQL matchers.
labels values appear to come from user input and are inserted directly into a PromQL selector. Unescaped quotes, backslashes, or PromQL metacharacters can break the query or allow query injection. Please escape value correctly (e.g., handle " and \) or use a dedicated helper to safely construct PromQL label matchers.
|
|
||
| LOGGER.info("namespaceLabelFilterQuery: {}", namespaceQuery); | ||
| JsonArray namespaceQueryResultArray = fetchQueryResults(dataSourceInfo, namespaceQuery, startTime, endTime, steps); | ||
| return new DataSourceMetadataHelper().getActiveNamespaces(namespaceQueryResultArray); |
There was a problem hiding this comment.
suggestion (performance): Avoid instantiating DataSourceMetadataHelper for each query and reuse the existing helper where possible.
Here and in getMatchedWorkloads, a new DataSourceMetadataHelper is created instead of using the existing dataSourceDetailsHelper instance. Even if the helper is stateless, this adds unnecessary allocations and diverges from the class’s existing pattern. Please reuse the existing (or a shared) instance for consistency and to avoid extra overhead.
| image: myapp:v1.0 | ||
| ``` | ||
|
|
||
| **Key Point**: Labels are on the **POD**, not on namespace or workload! |
There was a problem hiding this comment.
nitpick (typo): Consider pluralizing 'namespace or workload' for grammatical clarity.
Since this refers to the concepts in general, consider: "Labels are on the POD, not on namespaces or workloads!"
| **Key Point**: Labels are on the **POD**, not on namespace or workload! | |
| **Key Point**: Labels are on the **POD**, not on namespaces or workloads! |
Signed-off-by: Pinky Gupta <pinky.gupta1@ibm.com>
Signed-off-by: Pinky Gupta <pinky.gupta1@ibm.com>
Signed-off-by: Pinky Gupta <pinky.gupta1@ibm.com>
Signed-off-by: Pinky Gupta <pinky.gupta1@ibm.com>
Signed-off-by: Pinky Gupta <pinky.gupta1@ibm.com>
Signed-off-by: Pinky Gupta <pinky.gupta1@ibm.com>
Signed-off-by: Pinky Gupta <pinky.gupta1@ibm.com>
There was a problem hiding this comment.
Hey - I've found 2 issues, and left some high level feedback:
- The new label-filtering path logs quite heavily at INFO level (per-label, per-query, per-merge, plus experiment creation); consider downgrading most of these to DEBUG to avoid noisy logs and performance impact in large clusters.
- The semantics in
filterMetadataInfoObjectwhere a requested label filter that yields no matches results in no filtering (all resources kept) may be surprising for callers; it might be safer to distinguish between 'no metrics / query failure' and a genuine 'no matches' case and let the latter produce an empty selection. - In
buildLabelFilters, pod and namespace label filters are built identically by duplicating the matcher lists; you could simplify this by building the matcher list once and reusing it for bothpodLabelFilterandnamespaceLabelFilterto reduce code duplication.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- The new label-filtering path logs quite heavily at INFO level (per-label, per-query, per-merge, plus experiment creation); consider downgrading most of these to DEBUG to avoid noisy logs and performance impact in large clusters.
- The semantics in `filterMetadataInfoObject` where a requested label filter that yields no matches results in *no* filtering (all resources kept) may be surprising for callers; it might be safer to distinguish between 'no metrics / query failure' and a genuine 'no matches' case and let the latter produce an empty selection.
- In `buildLabelFilters`, pod and namespace label filters are built identically by duplicating the matcher lists; you could simplify this by building the matcher list once and reusing it for both `podLabelFilter` and `namespaceLabelFilter` to reduce code duplication.
## Individual Comments
### Comment 1
<location path="src/main/java/com/autotune/analyzer/workerimpl/BulkJobManager.java" line_range="158-159" />
<code_context>
- includeResourcesMap = buildRegexFilters(this.bulkInput.getFilter().getInclude());
- excludeResourcesMap = buildRegexFilters(this.bulkInput.getFilter().getExclude());
+ labelString = getLabelsForExperimentName(this.bulkInput.getFilter());
+ includeResourcesMap = buildResourceFilters(this.bulkInput.getFilter().getInclude());
+ excludeResourcesMap = buildResourceFilters(this.bulkInput.getFilter().getExclude());
}
if (null == this.bulkInput.getDatasource()) {
</code_context>
<issue_to_address>
**issue (bug_risk):** Exclude label filters are built but never used in label-based metadata filtering.
`DataSourceMetadataOperator` only uses `includeResources` (`namespaceLabelFilter`, `podLabelFilter`) when calling `getNamespacesMatchingLabelFilter` and `getWorkloadsMatchingPodLabelFilter`. With this change, `excludeResourcesMap` is also populated by `buildResourceFilters`, but those exclude label filters are never consumed, so user-provided exclude labels are silently ignored in metadata import. Please either wire the exclude label filters through to `DataSourceMetadataOperator` with defined semantics, or stop adding `podLabelFilter`/`namespaceLabelFilter` to `excludeResourcesMap` to avoid misleading configuration.
</issue_to_address>
### Comment 2
<location path="src/main/java/com/autotune/analyzer/workerimpl/BulkJobManager.java" line_range="654-661" />
<code_context>
+ List<String> podLabelQueries = new ArrayList<>();
+ List<String> namespaceLabelQueries = new ArrayList<>();
+
+ labels.forEach((key, value) -> {
+ LOGGER.info("Processing label key: '{}' (original)", key);
+
+ // kube-state-metrics normalizes label names by replacing special chars with underscores
+ // e.g., "pod-template-hash" becomes "label_pod_template_hash"
+ // e.g., "app.kubernetes.io/component" becomes "label_app_kubernetes_io_component"
+ String normalizedKey = key.replaceAll("[^a-zA-Z0-9_]", "_");
+ LOGGER.info("Normalized key for PromQL: '{}'", normalizedKey);
+
+ // Handle both single string values and arrays of values
</code_context>
<issue_to_address>
**suggestion (performance):** High-volume INFO logging for each label key/value can be noisy in production.
These INFO logs in `buildLabelFilters`, `getLabelsForExperimentName`, and `escapePromQLLabelValue` run once per bulk request per label and can produce excessive log volume and I/O in large clusters. Since this is more diagnostic than operational, consider changing them to DEBUG or guarding them behind a debug flag.
Suggested implementation:
```java
labels.forEach((key, value) -> {
if (LOGGER.isDebugEnabled()) {
LOGGER.debug("Processing label key: '{}' (original)", key);
}
// kube-state-metrics normalizes label names by replacing special chars with underscores
// e.g., "pod-template-hash" becomes "label_pod_template_hash"
// e.g., "app.kubernetes.io/component" becomes "label_app_kubernetes_io_component"
String normalizedKey = key.replaceAll("[^a-zA-Z0-9_]", "_");
if (LOGGER.isDebugEnabled()) {
LOGGER.debug("Normalized key for PromQL: '{}'", normalizedKey);
}
```
1. In the same file (`BulkJobManager.java`), locate similar per-label INFO logs in the methods `getLabelsForExperimentName` and `escapePromQLLabelValue` (e.g., logs that print raw label keys/values or normalized values).
2. For each such log:
- Change `LOGGER.info(...)` to `LOGGER.debug(...)`.
- Wrap the call in an `if (LOGGER.isDebugEnabled()) { ... }` guard, as done above, to avoid unnecessary string formatting when debug is off.
3. Verify that there are no remaining high-volume per-label INFO logs in these methods, keeping INFO for truly operational/summary messages only (e.g., totals, high-level outcomes).
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
| includeResourcesMap = buildResourceFilters(this.bulkInput.getFilter().getInclude()); | ||
| excludeResourcesMap = buildResourceFilters(this.bulkInput.getFilter().getExclude()); |
There was a problem hiding this comment.
issue (bug_risk): Exclude label filters are built but never used in label-based metadata filtering.
DataSourceMetadataOperator only uses includeResources (namespaceLabelFilter, podLabelFilter) when calling getNamespacesMatchingLabelFilter and getWorkloadsMatchingPodLabelFilter. With this change, excludeResourcesMap is also populated by buildResourceFilters, but those exclude label filters are never consumed, so user-provided exclude labels are silently ignored in metadata import. Please either wire the exclude label filters through to DataSourceMetadataOperator with defined semantics, or stop adding podLabelFilter/namespaceLabelFilter to excludeResourcesMap to avoid misleading configuration.
| labels.forEach((key, value) -> { | ||
| LOGGER.info("Processing label key: '{}' (original)", key); | ||
|
|
||
| // kube-state-metrics normalizes label names by replacing special chars with underscores | ||
| // e.g., "pod-template-hash" becomes "label_pod_template_hash" | ||
| // e.g., "app.kubernetes.io/component" becomes "label_app_kubernetes_io_component" | ||
| String normalizedKey = key.replaceAll("[^a-zA-Z0-9_]", "_"); | ||
| LOGGER.info("Normalized key for PromQL: '{}'", normalizedKey); |
There was a problem hiding this comment.
suggestion (performance): High-volume INFO logging for each label key/value can be noisy in production.
These INFO logs in buildLabelFilters, getLabelsForExperimentName, and escapePromQLLabelValue run once per bulk request per label and can produce excessive log volume and I/O in large clusters. Since this is more diagnostic than operational, consider changing them to DEBUG or guarding them behind a debug flag.
Suggested implementation:
labels.forEach((key, value) -> {
if (LOGGER.isDebugEnabled()) {
LOGGER.debug("Processing label key: '{}' (original)", key);
}
// kube-state-metrics normalizes label names by replacing special chars with underscores
// e.g., "pod-template-hash" becomes "label_pod_template_hash"
// e.g., "app.kubernetes.io/component" becomes "label_app_kubernetes_io_component"
String normalizedKey = key.replaceAll("[^a-zA-Z0-9_]", "_");
if (LOGGER.isDebugEnabled()) {
LOGGER.debug("Normalized key for PromQL: '{}'", normalizedKey);
}- In the same file (
BulkJobManager.java), locate similar per-label INFO logs in the methodsgetLabelsForExperimentNameandescapePromQLLabelValue(e.g., logs that print raw label keys/values or normalized values). - For each such log:
- Change
LOGGER.info(...)toLOGGER.debug(...). - Wrap the call in an
if (LOGGER.isDebugEnabled()) { ... }guard, as done above, to avoid unnecessary string formatting when debug is off.
- Change
- Verify that there are no remaining high-volume per-label INFO logs in these methods, keeping INFO for truly operational/summary messages only (e.g., totals, high-level outcomes).
Signed-off-by: Pinky Gupta <pinky.gupta1@ibm.com>
Signed-off-by: Pinky Gupta <pinky.gupta1@ibm.com>
Signed-off-by: Pinky Gupta <pinky.gupta1@ibm.com>
Description
Implement namespace and pod label selector filtering support in Bulk API resource selection and experiment creation flow. The implementation should ensure that experiments are created only for resources matching the provided labels for experiment type container only.
Namespace label selector filtering implemented
Pod label selector filtering implemented
Label-aware metadata retrieval integrated into Bulk API flow
Type of change
How has this been tested?
Manually tested - will raise separate PR for test cases
Test Configuration
Checklist 🎯
Additional information
Include any additional information such as links, test results, screenshots here
Summary by Sourcery
Introduce label-based filtering support into bulk metadata retrieval and document the design and usage for Bulk API label filtering.
New Features:
Enhancements:
Documentation:
Summary by Sourcery
Add label-based namespace and pod filtering to the bulk metadata and experiment creation flow so that only matching container workloads are selected for experiments.
New Features:
Enhancements:
Documentation: