[Connectors] Refactor YETI to API v2 payload and auth schemas #3736

Open
sanjib2006 wants to merge 5 commits into gsoc-2026/connectors from gsoc-2026/refactor-yeti-v2-auth

Conversation

@sanjib2006
Member

@sanjib2006 sanjib2006 commented Jun 2, 2026

Closes #3707

Description

  • Implemented YETI JWT authentication and access token handling.
  • Updated YETI API endpoints and request payload schemas to align with the current v2 API requirements.
  • Added unit tests covering:
    • Authentication failure scenario.
    • Missing access token in authentication response.
    • Successful file-sample submission and processing flow.
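
The authentication step described above can be sketched roughly as follows. This is a minimal illustration, not the connector's actual code: the `fetch_access_token` helper, the `YetiAuthError` exception, the use of POST, and the `access_token` response key are all assumptions made for the example. Only the `/api/v2/auth/api-token` path and the `x-yeti-apikey` header come from the diff in this PR.

```python
class YetiAuthError(Exception):
    """Hypothetical error raised when authentication yields no usable token."""


def fetch_access_token(session, base_url: str, api_key: str) -> str:
    """Exchange an API key for an access token.

    `session` is any object exposing a requests-style `post(url, headers=...)`
    method, so a real HTTP session or a test double can be passed in.
    """
    auth_url = f"{base_url.rstrip('/')}/api/v2/auth/api-token"
    # Assumption: the token endpoint is called with POST.
    response = session.post(auth_url, headers={"x-yeti-apikey": api_key})

    # Authentication failure scenario.
    if response.status_code != 200:
        raise YetiAuthError(f"authentication failed with HTTP {response.status_code}")

    # Missing access token scenario. The "access_token" key is an assumption.
    token = response.json().get("access_token")
    if not token:
        raise YetiAuthError("authentication response did not contain an access token")
    return token
```

Passing the session in as a parameter keeps the two failure scenarios listed above easy to exercise with a stub instead of a live YETI instance.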

Type of change

Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue).

Checklist

  • I have read and understood the rules about how to Contribute to this project
  • The pull request is for the branch gsoc-2026/connectors
  • I have inserted the copyright banner at the start of the file: # This file is a part of IntelOwl https://github.com/intelowlproject/IntelOwl # See the file 'LICENSE' for copying permission.
  • Linters (Ruff) gave 0 errors. If you have correctly installed pre-commit, it does these checks and adjustments on your behalf.
  • I have added tests for the feature/bug I solved (see tests folder). All the tests (new and old ones) gave 0 errors.

@sanjib2006
Member Author

Screenshots:

image image


if self._url_key_name and self._url_key_name.endswith("/"):
    self._url_key_name = self._url_key_name[:-1]
url = f"{self._url_key_name}/api/v2/observables/"
Member Author

@sanjib2006 sanjib2006 Jun 2, 2026

  • In the current version of YETI, the /api/v2/observables/ endpoint accepts only tags, value, and type.

Check here

# create observable with `obs_value` if it doesn't exist
# new context, tags, source are appended with existing ones

url = f"{self._url_key_name}/api/v2/observables/extended"
Member Author

  • Extra context can be sent only through the /api/v2/observables/extended endpoint.

Check here

"observable": {
    # there are type mismatches between YETI and IntelOwl
    # so for now we are not sending the type to YETI
    # "type": obs_type,
Member Author

Without providing the type it defaults to jarm. Check the screenshot.

@sanjib2006
Member Author

Regarding the type:

These are the supported types in YETI.

I am not sure how v1 worked, but I have checked v2 thoroughly.

  • If type is omitted from the payload, YETI defaults it to jarm (I tried a file and an IP and got the same result).
  • The current implementation has an obs_type variable:
    def run(self):
        # get observable value and type
        if self._job.is_sample:
            obs_value = self._job.analyzable.md5
            obs_type = "file"
        else:
            obs_value = self._job.analyzable.name
            obs_type = self._job.analyzable.classification
  • But there are mismatches: .classification returns "ip" rather than "ipv4" or "ipv6", and YETI does not recognize an ip type (it recognizes only ipv4 and ipv6).

    IntelOwl/api_app/choices.py

    Lines 105 to 112 in 38ebe10

    class Classification(models.TextChoices):
        IP = "ip"
        URL = "url"
        DOMAIN = "domain"
        HASH = "hash"
        GENERIC = "generic"
        FILE = "file"
  • So we have to handle ip, domain and hash explicitly; otherwise we will get errors for these unknown types.
  • YETI does have a guess_type function that determines the observable type itself (triggered by sending "type": "guess"), but the endpoints I am working on, i.e. .../ and .../extended/, do not currently call it (I have tested that as well).

@sanjib2006
Member Author

I will add the related tests and handle the type mismatch tomorrow. I have linked to the official source wherever needed because the YETI docs do not cover these details.

@sanjib2006 sanjib2006 linked an issue Jun 2, 2026 that may be closed by this pull request
"report": f"{settings.WEB_CLIENT_URL}/jobs/{self.job_id}",
"status": "analyzed",
- "date": str(self._job.finished_analysis_time),
+ "date": str(self._job.received_request_time),
Member Author

finished_analysis_time was coming through as NULL (see the first screenshot), probably because of a race condition, so I replaced it with received_request_time.

@sanjib2006
Member Author

Screenshots:

Tests

image

with type parameter

image image

obs_type = self._job.analyzable.classification

# convert obs_type to YETI's expected types if possible
if obs_type == "ip":
Member Author

These three classifications conflicted with YETI's types:

  • ip: converted to ipv4 or ipv6 (falling back to generic if there is any problem)
  • domain: converted to hostname
  • hash: there are multiple hash types, so for now I am mapping it to generic
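
The mapping described above could look roughly like the sketch below. The `to_yeti_type` helper and the `_DIRECT_MAPPING` table are hypothetical names for illustration, and the sketch assumes the remaining IntelOwl classifications (url, file, generic) can be passed to YETI unchanged.

```python
import ipaddress

# IntelOwl classifications that need a different name on the YETI side.
_DIRECT_MAPPING = {"domain": "hostname", "hash": "generic"}


def to_yeti_type(obs_type: str, obs_value: str) -> str:
    """Translate an IntelOwl classification into a YETI observable type."""
    if obs_type == "ip":
        try:
            # ip_address() parses both families; .version is 4 or 6.
            version = ipaddress.ip_address(obs_value).version
        except ValueError:
            # Unparseable value: fall back to the catch-all type.
            return "generic"
        return f"ipv{version}"
    # Anything not listed in the mapping is passed through unchanged.
    return _DIRECT_MAPPING.get(obs_type, obs_type)
```

For example, `to_yeti_type("ip", "8.8.8.8")` yields `"ipv4"` and `to_yeti_type("domain", "example.com")` yields `"hostname"`.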

"status": "analyzed",
- "date": str(self._job.finished_analysis_time),
+ "date": str(self._job.received_request_time),
"description": f"IntelOwl's analysis report for Job: {self.job_id} | {obs_value} | {obs_type}",
Member Author

Since we changed obs_type to match YETI, this context will also contain the YETI type rather than the IntelOwl one.

@sanjib2006
Member Author

Hello @mlodic!
This is done. Please have a look whenever you get time.

@sanjib2006 sanjib2006 marked this pull request as ready for review June 4, 2026 14:30
@sanjib2006 sanjib2006 requested a review from mlodic June 4, 2026 14:49
Member

@mlodic mlodic left a comment

A question: why is the IP address classified as JARM in this case?


# auth
auth_url = f"{self._url_key_name}/api/v2/auth/api-token"
auth_headers = {"x-yeti-apikey": self._api_key_name}
Member

Both here and in the headers of the subsequent requests, we can add "IntelOwl" as the user agent. This is common practice to help detect requests coming from integrations.

Member Author

I have added the user agent in the latest commit
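
A rough sketch of what the header construction might look like with the user agent added. The helper names are hypothetical, and the `Authorization: Bearer ...` scheme for the follow-up requests is an assumption; only the `x-yeti-apikey` header and the "IntelOwl" user-agent value come from this thread.

```python
USER_AGENT = "IntelOwl"


def build_auth_headers(api_key: str) -> dict:
    """Headers for the token request (hypothetical helper)."""
    return {"x-yeti-apikey": api_key, "User-Agent": USER_AGENT}


def build_request_headers(access_token: str) -> dict:
    """Headers for subsequent API calls (hypothetical helper).

    Assumption: the access token is sent as a Bearer token.
    """
    return {"Authorization": f"Bearer {access_token}", "User-Agent": USER_AGENT}
```

Keeping the user-agent string in a single constant means both the auth request and the later requests identify themselves the same way.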


# create context
context = {
"source": "IntelOwl",
Member

Should the source be removed here too, in favor of a user-agent string?

Member Author

image

Without source, this field becomes blank (see the 2nd and 3rd entries); the same happens on the observable page.

@sanjib2006
Member Author

A question: why is the IP address classified as JARM in this case?

I checked it. If the type parameter is not provided, the intended behavior is to raise an error, but this endpoint works without type because of a bug.

Check this line

The parameter here should be discriminator, not discriminant (a spelling mistake). As a result, all models are tried and the first one that validates is picked. For the observable I tested, that was JARM, which does not actually have any validator (anything can match jarm in YETI currently).

I have raised an issue for this yeti-platform/yeti#1273
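
To illustrate the difference, here is a toy model of the two resolution strategies. It is not YETI's or its validation library's actual code: the candidate list, its order, and the validators are all hypothetical, chosen only to show why a permissive model that is tried first wins when no discriminator is in effect.

```python
import ipaddress


def _accept_anything(value: str) -> bool:
    # Stand-in for a model without any validator.
    return True


def _is_ipv4(value: str) -> bool:
    try:
        return ipaddress.ip_address(value).version == 4
    except ValueError:
        return False


# (type name, validator) pairs, tried in this order. Hypothetical ordering.
CANDIDATES = [("jarm", _accept_anything), ("ipv4", _is_ipv4)]


def resolve_first_match(payload: dict) -> str:
    """No discriminator: return the first candidate that validates."""
    for name, validator in CANDIDATES:
        if validator(payload["value"]):
            return name
    raise ValueError("no candidate matched")


def resolve_by_discriminator(payload: dict) -> str:
    """Discriminator on 'type': a missing or unknown type is an error."""
    validators = dict(CANDIDATES)
    type_name = payload.get("type")
    if type_name not in validators:
        raise ValueError(f"missing or unknown type: {type_name!r}")
    if not validators[type_name](payload["value"]):
        raise ValueError(f"value is not a valid {type_name}")
    return type_name
```

With this toy setup, `resolve_first_match({"value": "8.8.8.8"})` returns `"jarm"`, while `resolve_by_discriminator` rejects the same payload because it has no type.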

Development

Successfully merging this pull request may close these issues.

[Connectors] - Fix Yeti API v2 Authentication

2 participants