feat: defer item tagging to an external agent (phase 2) #120
Open
jansitarski wants to merge 4 commits into
Add TaggingStatus (pending|tagged) and TaggedBy (auto|manual) enums with tagging_status/tagged_by/tagged_at columns on clothing_items, exposed on ItemResponse and filterable via ItemFilter. tagged_by records how the current tags were produced and is only ever set server-side: auto (internal AI worker) or manual (supplied through the API). Existing rows are backfilled as tagged/auto so they do not appear as pending external work.
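As a rough illustration of the two enums and the three new columns, a minimal Python sketch could look like the following. The `ItemTaggingState` dataclass and the `backfilled_state` helper are hypothetical names used only for illustration; the actual models and migration in this PR may be structured differently.

```python
from dataclasses import dataclass
from datetime import datetime
from enum import Enum
from typing import Optional


class TaggingStatus(str, Enum):
    PENDING = "pending"
    TAGGED = "tagged"


class TaggedBy(str, Enum):
    AUTO = "auto"      # tags produced by the internal AI worker
    MANUAL = "manual"  # tags supplied through the API


@dataclass
class ItemTaggingState:
    """Hypothetical stand-in for the three new clothing_items columns."""

    tagging_status: TaggingStatus = TaggingStatus.PENDING
    tagged_by: Optional[TaggedBy] = None
    tagged_at: Optional[datetime] = None


def backfilled_state() -> ItemTaggingState:
    # Rows that predate the migration are marked tagged/auto so they never
    # show up as pending external work. The source does not say what
    # tagged_at is backfilled to, so it is left unset here (assumption).
    return ItemTaggingState(TaggingStatus.TAGGED, TaggedBy.AUTO)
```

Subclassing `str` keeps the enum values directly comparable to the strings used in query parameters such as `tagging_status=pending`.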
…rface
With internal vision disabled (or auto_tag=false on upload), items skip the
tagging queue and are left ready + tagging_status=pending, immediately usable
while untagged. GET /items?tagging_status=pending is the external tagger's
work queue. A content-bearing PATCH on a pending item marks it tagged with a
server-derived manual origin — a one-way transition that never rewrites an
existing origin — and projects tag attributes onto their first-class columns
(parity with the worker's dual-write). POST /items/{id}/retag resets an item
to the queue. The internal worker stamps tagged/auto on successful auto-tag
and leaves skipped items pending.
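The lifecycle described in this commit can be sketched as three small functions: the enqueue guard, the one-way write-back transition, and the retag reset. This is a simplified model under stated assumptions, not the code in this branch: `Item`, `should_enqueue`, `has_content`, `apply_write_back`, and `retag` are hypothetical names, the definition of "empty" tag values is a guess, and clearing `tagged_at` on retag is assumed rather than stated in the source.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any, Dict, Optional


@dataclass
class Item:
    tags: Dict[str, Any] = field(default_factory=dict)
    tagging_status: str = "pending"
    tagged_by: Optional[str] = None
    tagged_at: Optional[datetime] = None


def should_enqueue(vision_enabled: bool, auto_tag: bool = True) -> bool:
    # Only queue an internal tagging job when vision is on and the caller
    # did not opt out; otherwise the item stays ready + pending.
    return vision_enabled and auto_tag


def has_content(tags: Dict[str, Any]) -> bool:
    # Assumption: None, empty strings, and empty containers are "no content".
    return any(value not in (None, "", [], {}) for value in tags.values())


def apply_write_back(item: Item, tags: Dict[str, Any], now: datetime) -> Item:
    item.tags.update(tags)
    # One-way transition: only a pending item with real content is stamped,
    # so an existing origin (for example "auto") is never rewritten.
    if item.tagging_status == "pending" and has_content(tags):
        item.tagging_status = "tagged"
        item.tagged_by = "manual"
        item.tagged_at = now
    return item


def retag(item: Item) -> Item:
    # Back to the work queue: origin is cleared, tag content is kept.
    item.tagging_status = "pending"
    item.tagged_by = None
    item.tagged_at = None  # assumption: the timestamp is cleared as well
    return item
```

For example, `apply_write_back(Item(), {"pattern": None}, datetime.now(timezone.utc))` leaves the item pending, while the same call with `{"pattern": "striped"}` marks it tagged with a `manual` origin.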
The features block shipped with its flags set to false until the write-back surface existed; the tagging surface now exists, so flip external_tagging on. Suggestions and pairings stay false until their authoring endpoints land.
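An external client would presumably check this flag before polling for work. The sketch below assumes the capabilities payload is a JSON object with a `features` mapping; the `external_tagging_enabled` helper and the example payload are hypothetical, and any other keys the real response carries are not modeled.

```python
from typing import Any, Mapping


def external_tagging_enabled(capabilities: Mapping[str, Any]) -> bool:
    # Treat a missing features block or a missing flag as "not supported".
    features = capabilities.get("features") or {}
    return features.get("external_tagging") is True


# Assumed shape of the features block after this change.
example_capabilities = {
    "features": {
        "external_tagging": True,
        "external_suggestions": False,
        "external_pairings": False,
    }
}
```

Checking with `is True` rather than truthiness keeps a malformed value such as the string `"false"` from being read as enabled.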
AI-on and AI-off paths: the auto_tag / vision enqueue guards, the pending work-queue filter, write-back origin stamping and its one-way transition, origin forgery rejection, empty write-backs staying pending, tag-to-column projection, and the retag reset. Default behavior (vision on) is asserted unchanged.
Description
This is phase 2 of making the internal AI optional so the backend can run with internal generation disabled and defer work to an external agent (e.g. Claude via an MCP server). Phase 1 (#113, merged in v1.4.0) added the capability switches and `/capabilities`; this phase makes item tagging a first-class, externally-ownable surface.

Today tagging happens implicitly via the internal vision model, and there is no way to (a) leave an item untagged for something else to tag, or (b) record whether tags came from the machine or a person. This adds an explicit tagging lifecycle and a server-derived write origin:
- New columns on `clothing_items`: `tagging_status` (`pending`|`tagged`), `tagged_by` (`auto`|`manual`), `tagged_at`, with native PG enum types. Existing rows are backfilled to `tagged`/`auto` so nothing changes for current data.
- The internal worker stamps `tagged`/`auto` on success.
- `POST /items` gains an `auto_tag` flag. Every enqueue site (single create, bulk create, re-analyze) is vision-guarded: when internal vision is off (or `auto_tag=false`), the item is left `ready` + `pending` for an external tagger instead of queuing a no-op job.
- `GET /items?tagging_status=pending` exposes the external tagger's work queue.
- A `PATCH /items/{id}` that fills in a still-`pending` item's tags marks it `tagged` with a server-derived origin (`manual`). This is gated on `pending`, so it is a one-way transition and never re-stamps an already-tagged item. It also requires actual content: a PATCH carrying only empty/null tag values leaves the item pending. A tags write-back also projects its attributes onto their first-class columns (`pattern`, `material`, `style`, `season`, `formality`, `colors`, `primary_color`), keeping the column representation in sync with the `tags` JSONB. This matches the internal worker's behavior, so externally-tagged items remain visible to column-based filters/scoring.
- `POST /items/{id}/retag` resets an item to the pending queue (clears origin, keeps tag content).
- `GET /capabilities` now advertises `features.external_tagging: true`, per the contract established in phase 1 (flags flip in the PR that ships the write-back surface). `external_suggestions`/`external_pairings` stay `false` until phase 3.

Everything is additive and defaults to current behavior (internal vision on → items auto-tag exactly as before). The motivating consumers are external MCP servers that front this backend for an LLM; the design is provider-agnostic.
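The tags-to-column projection mentioned above can be illustrated with a small sketch. It assumes that tag keys share their names with the first-class columns and that empty values are skipped rather than written; neither detail is confirmed by the description, and `PROJECTED_COLUMNS` and `project_tags` are hypothetical names.

```python
from typing import Any, Dict

# Column names as listed in the description.
PROJECTED_COLUMNS = (
    "pattern",
    "material",
    "style",
    "season",
    "formality",
    "colors",
    "primary_color",
)


def project_tags(tags: Dict[str, Any]) -> Dict[str, Any]:
    """Return the column updates implied by a tags write-back."""
    return {
        column: tags[column]
        for column in PROJECTED_COLUMNS
        # Assumption: None, "" and [] carry no content and are not projected.
        if tags.get(column) not in (None, "", [])
    }
```

Calling `project_tags({"pattern": "plaid", "colors": ["red"], "fit": "slim"})` yields updates for `pattern` and `colors` only; `fit` stays in the `tags` JSONB because it has no first-class column.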
Related Issue
Related to #99; builds on #113 (phase 1)
Type of Change
Checklist
Testing
Test Environment
Tests Performed
- `backend/tests/test_item_tagging.py` (14 tests): pending default + auto-tag worker origin; `auto_tag`/vision enqueue guards; pending work-queue filter; PATCH write-back origin; empty write-backs stay pending; no body forgery of origin; no re-stamp of an already-tagged item; tags→column projection; retag reset.
- Test run: 349 passed.
- `ruff check` and `ruff format` clean.
Additional Notes
- `tagged_by` is derived from the write path: the internal worker stamps `auto`; API write-backs stamp `manual`. `tagging_status`/`tagged_by`/`tagged_at` are intentionally absent from the writable schemas.
- `tagged_by` records `auto` vs `manual` only. An earlier iteration added a third `agent` origin (signed JWT `actor` claim + a shared-secret mint path); it was dropped because no feature consumes write provenance and `tagged_by` grants no authority, so the unforgeability machinery wasn't worth the surface. The backend also cannot verify whether an API client is a human app or an agent. The enum can gain a value later via an additive migration if a feature ever needs to trust provenance.
- `ItemResponse` now includes `tagging_status`, `tagged_by`, and `tagged_at`. No existing field changes shape.
- `list_untagged_items` (`tagging_status` filter), `auto_tag` on the create tools, `retag_item`, and the tag write-back were validated end-to-end against this branch.
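One way to picture "absent from the writable schemas" is a validation step that refuses server-owned fields in a request body. The sketch below uses plain Python rather than whatever schema library the backend uses, and it rejects forged fields outright; the real schemas might instead ignore them silently. `SERVER_OWNED`, `ForgedOriginError`, and `validate_patch` are hypothetical names.

```python
from typing import Any, Dict

# Fields that only the server may set.
SERVER_OWNED = frozenset({"tagging_status", "tagged_by", "tagged_at"})


class ForgedOriginError(ValueError):
    """Raised when a client tries to write a server-owned tagging field."""


def validate_patch(body: Dict[str, Any]) -> Dict[str, Any]:
    forged = SERVER_OWNED.intersection(body)
    if forged:
        raise ForgedOriginError(
            f"server-owned fields are not writable: {sorted(forged)}"
        )
    # Return a copy so later mutation does not affect the caller's dict.
    return dict(body)
```

With this guard, a body such as `{"tags": {...}, "tagged_by": "auto"}` fails validation, so the origin can only ever be derived from the write path.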