[GSoC 2026] Add analyzer/playbook LLM tools to the chatbot agent (refs #3732) by berardifra · Pull Request #3740 · intelowlproject/IntelOwl

berardifra · 2026-06-04T13:09:09Z

Refs #3732. (Targets the gsoc-2026/llm-chatbot umbrella, not develop. Using Refs (not Closes) on purpose — the umbrella isn't develop, so the issue is closed manually after merge.)

Description

Adds two read-only tools to the chatbot LangChain ReAct agent (W5 — analyzer/playbook domain):

- list_analyzers(observable_type="", limit=50): lists the analyzers enabled for the requesting user (globally not disabled, and not disabled for the user's organization), optionally filtered to those supporting a given observable type. observable_type is validated against the Classification enum; an unknown value is reported in errors instead of silently returning an empty list.
- recommend_playbook(observable_name="", classification="", limit=50): suggests directly-launchable (starting=True, non-disabled) playbooks applicable to an observable's classification, scoped to the playbooks visible to the user (visible_for_user: owned + organization-shared). The classification is derived from the observable via Classification.calculate_observable when not supplied; an explicitly provided one is validated against the enum.
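
As an illustration of the validation behaviour described for list_analyzers, here is a minimal sketch in plain Python. It does not use the real IntelOwl models: the Classification enum below is a stand-in with assumed values, ANALYZERS is a hypothetical in-memory catalogue, and returning an empty list alongside the error is an assumption about the payload shape.

```python
from enum import Enum


class Classification(str, Enum):
    # Stand-in for IntelOwl's Classification choices (values assumed).
    IP = "ip"
    URL = "url"
    DOMAIN = "domain"
    HASH = "hash"
    GENERIC = "generic"
    FILE = "file"


# Hypothetical catalogue standing in for the analyzer queryset.
ANALYZERS = [
    {"name": "ExampleDns", "observable_supported": ["domain", "url"]},
    {"name": "ExampleIpRep", "observable_supported": ["ip"]},
]

# Observable analyzers never apply to files, so FILE is not an accepted type.
VALID_TYPES = [c.value for c in Classification if c is not Classification.FILE]


def list_analyzers(observable_type: str = "") -> dict:
    """Return the {"errors": [...], "analyzers": [...]} envelope as a dict."""
    errors = []
    rows = ANALYZERS
    if observable_type:
        if observable_type not in VALID_TYPES:
            # Report the bad value instead of silently returning nothing.
            valid = ", ".join(VALID_TYPES)
            errors.append(
                f"Unknown observable_type '{observable_type}'; valid values are: {valid}."
            )
            rows = []
        else:
            rows = [a for a in ANALYZERS if observable_type in a["observable_supported"]]
    return {"errors": errors, "analyzers": [a["name"] for a in rows]}
```

With this shape the agent can tell "no analyzer supports this type" (empty list, no errors) apart from "the type itself was wrong" (non-empty errors).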

Both follow the existing tool pattern (a make_*_tool(user) factory that closes over the user, so
multi-tenancy is enforced at build time and the LLM can't widen the scope) and return the standard
{"errors": [...], "<payload>": ...} envelope via ToolResultSerializer.to_json(). Output uses
purpose-built light serializers (AnalyzerConfigToolSerializer, PlaybookConfigToolSerializer) rather than
the heavy config serializers, to keep the LLM prompt small. Both cap their result list at 50
(_MAX_RESULTS) and clamp the LLM-supplied limit into [1, 50], so a single call can't flood the prompt.
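
The factory pattern described above can be sketched roughly as follows. This is only an outline under stated assumptions: User, PLAYBOOKS, and make_list_playbooks_tool are hypothetical names, the organization check stands in for the real visibility query, and json.dumps stands in for ToolResultSerializer.to_json().

```python
import json
from dataclasses import dataclass

_MAX_RESULTS = 50


@dataclass(frozen=True)
class User:
    # Hypothetical stand-in for the Django user and its organization.
    username: str
    organization: str


# Hypothetical rows standing in for the playbook queryset.
PLAYBOOKS = [
    {"name": "pb_a", "organization": "acme"},
    {"name": "pb_b", "organization": "globex"},
]


def make_list_playbooks_tool(user: User):
    """Build a tool bound to `user`; the tool itself takes no user argument."""

    def list_playbooks(limit: int = _MAX_RESULTS) -> str:
        # Tool arguments come from the LLM, so treat them as untrusted.
        limit = max(1, min(int(limit), _MAX_RESULTS))
        # Scope comes from the closed-over user, not from any tool argument.
        visible = [p["name"] for p in PLAYBOOKS if p["organization"] == user.organization]
        return json.dumps({"errors": [], "playbooks": visible[:limit]})

    return list_playbooks
```

Because the user is captured when the tool is built, nothing the model passes at call time can select another tenant's rows.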

These tools are read-only — no analysis is triggered. The analysis-launching tool analyze_observable
(with its safety guardrails) follows in a separate PR.

Type of change

New feature (non-breaking change which adds functionality).

Checklist

mlodic · 2026-06-05T14:43:47Z

I think the conflicts are due to the two parallel PRs that change the same code.

I merged the previous one without any changes, so the conflict should be easy to fix.

mlodic · 2026-06-05T14:47:16Z

This can happen from time to time, since I can't review changes as soon as they are published, and some parallel PRs are normal in this kind of environment.

Try to keep this in mind while making changes, to avoid having to fix conflicts every time. Isolate the content as early as possible and make liberal use of additional files (for example, serializers.py and test_tools.py could become folders, with the isolated serializers and tests in separate files). This should reduce the issues.

…#3732)

Add two read-only tools to the chatbot ReAct agent:
- list_analyzers: lists the enabled observable analyzers, each annotated with a per-user 'runnable' readiness flag (no hard runnable filter, so key-based analyzers stay visible).
- recommend_playbook: suggests launchable (starting=True, enabled) playbooks matching an observable's classification, scoped with visible_for_user.

Both follow the make_*_tool(user) factory pattern and return the standard {errors, payload} envelope via purpose-built light serializers. Tests cover filtering, enum validation, the result cap, and org/visibility isolation.

mlodic · 2026-06-05T14:52:29Z

+
+# Observable analyzers never apply to the `file` classification (that's the file-analysis path),
+# so the accepted/advertised observable types are all classifications except FILE.
+_VALID_OBSERVABLE_TYPES = [c for c in Classification.values if c != Classification.FILE.value]


This is IntelOwl core information and should logically live in the core part.
Please modify the Classification class and add a helper that generates this list, then import it here so it can be reused.

For instance, I noticed that other parts of the code (see aggregate_observable_classification) use the same list, so it would be nice if you could update those too while we are here.

mlodic · 2026-06-05T14:53:43Z

+    # near-empty lists and flaky tests. We list the enabled analyzers and surface readiness as a
+    # per-row flag instead, so the LLM can still say "this analyzer applies but isn't configured
+    # for you". `runnable` is False when the analyzer is disabled for the user's organization OR
+    # not fully configured/healthy.


thanks for the thorough explanation, that's really useful
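
The per-row readiness flag described in the quoted code comment can be sketched as follows. This is an assumption-laden illustration: Analyzer, its configured and disabled_in_orgs fields, and annotate_runnable are hypothetical names, and the real check for "fully configured/healthy" is reduced to a single boolean.

```python
from dataclasses import dataclass, field


@dataclass
class Analyzer:
    # Hypothetical stand-in for an analyzer configuration row.
    name: str
    configured: bool
    disabled_in_orgs: set = field(default_factory=set)


def annotate_runnable(analyzers, user_org: str) -> list:
    """List every analyzer and mark readiness, rather than filtering rows out."""
    return [
        {
            "name": a.name,
            # False when disabled for the user's organization OR not configured.
            "runnable": a.configured and user_org not in a.disabled_in_orgs,
        }
        for a in analyzers
    ]
```

Analyzers that need an API key stay in the output with runnable set to False, so the model can still mention them.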

mlodic · 2026-06-05T14:55:35Z

+                errors.append(f"Unknown observable_type '{observable_type}'; valid values are: {valid}.")
+
+        # Clamp the LLM-supplied limit into [1, _MAX_RESULTS] (treat tool args as untrusted).
+        limit = max(1, min(int(limit), _MAX_RESULTS))


Should we explicitly signal that a clamp has been applied here? I mean with a comment, a "warning", or whatever.
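
One way to make the clamp visible, sketched under assumptions: clamp_limit is a hypothetical helper, the message wording is invented for illustration, and appending to the tool's errors list is one option among the "comment or warning" choices raised above.

```python
_MAX_RESULTS = 50


def clamp_limit(limit, errors: list) -> int:
    """Clamp an untrusted limit into [1, _MAX_RESULTS] and record any change."""
    requested = int(limit)
    clamped = max(1, min(requested, _MAX_RESULTS))
    if clamped != requested:
        # Surface the adjustment so a truncated list is never silent.
        errors.append(
            f"limit {requested} is outside [1, {_MAX_RESULTS}]; using {clamped}."
        )
    return clamped
```

A value already inside the range leaves errors untouched, so normal calls produce no extra noise in the envelope.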

berardifra · 2026-06-05T15:04:29Z

Yeah, I was expecting this one: two PRs touching the same files was a known risk on my side, so the rebase onto the umbrella was already part of the plan. Done now: additive conflicts only, kept both sides, 31/31 tests + Ruff green, mergeable again.
Good point on the isolation though -> I'll split serializers.py and test_tools.py into packages (one module per tool) so parallel PRs stop colliding.

…s + surface limit clamp (refs #3732)
- Add Classification.observable_classifications() as the single source of truth for 'every classification except FILE'; reuse it in aggregate_observable_classification
- Surface the limit clamp in the tool 'errors' (list_analyzers + recommend_playbook) so a truncated list is never silent
- Tests for both clamp warnings

berardifra · 2026-06-05T17:57:51Z

Addressed in 761950b, thanks

berardifra added the gsoc-2026 GSoC 2026 - LLM Chatbot project (Francesco Berardi) label Jun 4, 2026

berardifra requested a review from mlodic June 4, 2026 13:09

github-project-automation Bot added this to GSoC '26: "Integrating a Self-Deployed LLM Chatbot for Threat Intelligence" Jun 4, 2026

github-project-automation Bot moved this to Todo in GSoC '26: "Integrating a Self-Deployed LLM Chatbot for Threat Intelligence" Jun 4, 2026

berardifra moved this from Todo to In Review in GSoC '26: "Integrating a Self-Deployed LLM Chatbot for Threat Intelligence" Jun 4, 2026

berardifra force-pushed the gsoc-2026/llm-chatbot-w5-analyzer-tools branch from e62f543 to 8eff4cc Compare June 5, 2026 11:53

berardifra force-pushed the gsoc-2026/llm-chatbot-w5-analyzer-tools branch from 8eff4cc to b2a7dbc Compare June 5, 2026 14:55

mlodic reviewed Jun 5, 2026

berardifra wants to merge 2 commits into gsoc-2026/llm-chatbot from gsoc-2026/llm-chatbot-w5-analyzer-tools