fix: route AnalyzeText document errors to errorCol #2569
Move Azure AI Language document-level errors returned inside HTTP 200 AnalyzeText responses from the response payload into the configured error column after auto-batch flattening. Preserve transport error precedence and add a no-network regression test for mixed document success/error responses. AB#4638662 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Hey @ranadeepsingh 👋! We use semantic commit messages to streamline the release process. Examples of commit messages with semantic prefixes:
To test your commit locally, please follow our guide on building from source.
/azp run
Azure Pipelines successfully started running 1 pipeline(s).
Pull request overview
This PR updates the Azure AI Language AnalyzeText transformer to treat document-level errors returned inside HTTP 200 responses as per-document failures by routing them into the configured errorCol after auto-batch flattening, while preserving transport-error precedence. It also adds a regression test that exercises mixed success/error documents without making network calls.
Changes:
- Add a post-flatten pipeline step to move document-level `results.errors` entries into `errorCol` and clear them from the per-document output.
- Preserve precedence of existing transport errors by coalescing `errorCol` before routing document-level errors.
- Add a no-network Scala test that validates mixed success/error document handling.
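
The routing and precedence rule in the list above can be sketched without Spark. This is only an illustrative model, not the code in AnalyzeText.scala: `DocOutput`, `FlatRow`, `ErrorRouting`, and the `sentiment` field are hypothetical names, and joining multiple errors with `"; "` is an assumption made for the example.

```scala
// Hypothetical, Spark-free model of one row after auto-batch flattening.
// errors: document-level errors found inside an HTTP 200 response body.
final case class DocOutput(id: String, sentiment: Option[String], errors: Seq[String])

// errorCol: value of the configured error column (for example, a transport failure).
final case class FlatRow(output: DocOutput, errorCol: Option[String])

object ErrorRouting {
  // Move document-level errors into errorCol and clear them from the payload.
  // An existing errorCol value (transport error) takes precedence, which mirrors
  // a coalesce(errorCol, documentError) expression.
  def route(row: FlatRow): FlatRow = {
    val docError: Option[String] =
      if (row.output.errors.isEmpty) None
      else Some(row.output.errors.mkString("; ")) // assumed formatting
    FlatRow(
      output = row.output.copy(errors = Seq.empty),
      errorCol = row.errorCol.orElse(docError)
    )
  }
}
```

In this model, a row that already carries a transport error keeps it unchanged, a row with only document-level errors gets them in `errorCol`, and a fully successful row still has an empty `errorCol`.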
| File | Description |
|---|---|
| cognitive/src/main/scala/com/microsoft/azure/synapse/ml/services/language/AnalyzeText.scala | Adds a post-flatten Lambda stage to route document-level 200-response errors into errorCol and remove them from the output payload. |
| cognitive/src/test/scala/com/microsoft/azure/synapse/ml/services/language/AnalyzeTextSuite.scala | Adds a regression test using a custom handler to simulate mixed document success/error responses without network access. |
Copilot's findings
- Files reviewed: 2/2 changed files
- Comments generated: 1
Use the sbt launcher version from project/build.properties instead of installing the latest apt sbt package. This keeps the JDK 11 PR validation job on the repository's sbt 1.10.11 launcher and avoids sbt 2.x rejecting JDK 11 before scalastyle can run. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Invoke the downloaded sbt launcher explicitly so the GitHub runner does not resolve its preinstalled sbt 2.x binary under JDK 11. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Keep PR validation commands as plain sbt while placing the repository-version launcher first on PATH for subsequent workflow steps. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
/azp run
Azure Pipelines successfully started running 1 pipeline(s).
Codecov Report
✅ All modified and coverable lines are covered by tests.

Coverage Diff

| Metric | master | #2569 | +/- |
|---|---|---|---|
| Coverage | 84.80% | 84.78% | -0.02% |
| Files | 334 | 334 | |
| Lines | 17783 | 17801 | +18 |
| Branches | 1632 | 1619 | -13 |
| Hits | 15081 | 15093 | +12 |
| Misses | 2702 | 2708 | +6 |
Partition collected rows by error nullability instead of relying on collect order, addressing PR review feedback about Spark DataFrames being unordered. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
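
A minimal sketch of that test-side change, assuming the collected rows are mapped to a simple case class first. `Collected` and `PartitionByError` are hypothetical names used only for illustration; the real test works on Spark rows.

```scala
// Hypothetical shape of a collected output row in the regression test.
final case class Collected(id: String, sentiment: Option[String], error: Option[String])

object PartitionByError {
  // Split rows into (successes, failures) by whether the error value is present,
  // so assertions do not depend on the order in which collect() returned rows.
  def split(rows: Seq[Collected]): (Seq[Collected], Seq[Collected]) =
    rows.partition(_.error.isEmpty)
}
```

Assertions can then target each group, for example checking that every row in the failure group has a non-empty error, regardless of where that row appeared in the collected sequence.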
/azp run
Azure Pipelines successfully started running 1 pipeline(s).