Add Semantic Scholar no-auth chat tools app #7487

Open
liangtovi-debug wants to merge 2 commits into
BasedHardware:main from
liangtovi-debug:feat/semantic-scholar-chat-tools-3120

Conversation

@liangtovi-debug

/claim #3120

Summary

  • add a standalone plugins/omi-semantic-scholar-app integration with no OAuth/API key requirement
  • expose 3 chat tools:
    • search_semantic_scholar_papers
    • get_semantic_scholar_paper
    • get_semantic_scholar_author_papers
  • follow Omi chat-tool patterns:
    • JSON body input models via Pydantic
    • ChatToolResponse using result/error
    • manifest at /.well-known/omi-tools.json
  • include deploy/runtime files (requirements.txt, Procfile, railway.toml) and README

Validation

  • python3 -m py_compile plugins/omi-semantic-scholar-app/main.py plugins/omi-semantic-scholar-app/models.py

Notes

  • Uses public Semantic Scholar Graph API endpoints only (no credentials).
  • ID lookups URL-encode user-provided identifier path segments before requests.

@greptile-apps
Contributor

greptile-apps Bot commented May 25, 2026

Greptile Summary

Adds a new standalone Omi plugin (plugins/omi-semantic-scholar-app) that exposes three no-auth chat tools — paper search, paper detail lookup, and author-papers lookup — backed by the public Semantic Scholar Graph API. The plugin follows FastAPI + Pydantic patterns used by other plugins in the repo.

  • normalize_identifier strips the doi: prefix before building the API path, so user-supplied DOIs like doi:10.1234/5678 arrive at Semantic Scholar as 10.1234%2F5678 (no DOI: prefix), which the API cannot resolve and returns a 404; additionally, quote(..., safe="") encodes the colon in any preserved prefix (e.g. arXiv:), compounding the issue.
  • Manifest uses plain-string parameter values instead of the JSON Schema shape (properties/type/description/required) that other plugins in this repo use, which may prevent Omi from correctly passing typed arguments to these tools.
  • ChatToolResponse allows both result and error to be None simultaneously with no validator enforcing that at least one is populated.

Confidence Score: 3/5

The paper-by-DOI lookup path is broken out of the box — every DOI-based request will return a 404 from Semantic Scholar — so one of the three advertised tools is non-functional as written.

The normalize_identifier function strips the doi: prefix that Semantic Scholar requires to distinguish a DOI from a native paper ID, and quote(..., safe="") would additionally encode any colon in other identifier schemes (e.g., arXiv:). Together these mean the get_semantic_scholar_paper tool silently fails for any non-native-ID input. The other two tools (search and author lookup) are functionally correct, but the manifest's non-standard parameter format may prevent the Omi platform from properly wiring tool arguments.

main.py — specifically normalize_identifier (lines 38–42) and the quote call in get_paper (line 124), and the manifest parameter schema (lines 57–81).

Important Files Changed

| Filename | Overview |
| --- | --- |
| plugins/omi-semantic-scholar-app/main.py | Core app logic with two bugs: normalize_identifier strips the DOI prefix needed by Semantic Scholar, and quote(..., safe="") encodes colons, breaking all DOI-based paper lookups. Manifest also uses non-standard parameter format. |
| plugins/omi-semantic-scholar-app/models.py | Pydantic models are well-structured; ChatToolResponse allows both result and error to be None simultaneously with no validation guard. |
| plugins/omi-semantic-scholar-app/requirements.txt | Pinned dependency versions for fastapi, uvicorn, httpx, and pydantic; no issues found. |
| plugins/omi-semantic-scholar-app/Procfile | Standard Heroku/Railway Procfile for uvicorn startup; no issues. |
| plugins/omi-semantic-scholar-app/railway.toml | Railway deployment config with health check and restart policy; no issues. |
| plugins/omi-semantic-scholar-app/README.md | Minimal README covering local run and manifest URL; no issues. |

Sequence Diagram

sequenceDiagram
    participant Omi as Omi Platform
    participant App as omi-semantic-scholar-app (FastAPI)
    participant SS as Semantic Scholar Graph API

    Omi->>App: GET /.well-known/omi-tools.json
    App-->>Omi: Tool manifest (3 tools)

    Omi->>App: POST /tools/search_semantic_scholar_papers
    App->>SS: "GET /paper/search?query=...&limit=...&fields=..."
    SS-->>App: "{data: [...papers]}"
    App-->>Omi: "ChatToolResponse {result: formatted list}"

    Omi->>App: POST /tools/get_semantic_scholar_paper
    Note over App: normalize_identifier() strips doi: prefix
    Note over App: quote(..., safe="") encodes colon — DOI lookup breaks
    App->>SS: "GET /paper/{encoded_id}?fields=..."
    SS-->>App: paper detail or 404
    App-->>Omi: "ChatToolResponse {result or error}"

    Omi->>App: POST /tools/get_semantic_scholar_author_papers
    App->>SS: "GET /author/{encoded_id}?fields=name,papers.*"
    SS-->>App: "{name, papers: [...]}"
    App-->>Omi: "ChatToolResponse {result: sorted paper list}"

Reviews (1): Last reviewed commit: "Add Semantic Scholar no-auth chat tools ..."

Comment on lines +38 to +42
def normalize_identifier(raw: str) -> str:
    value = raw.strip()
    if value.lower().startswith("doi:"):
        value = value[4:]
    return value

P1 DOI prefix stripped and colon encoding — DOI lookups always 404

normalize_identifier removes the doi: prefix before the value is passed to the Semantic Scholar API. Semantic Scholar's paper lookup endpoint requires the DOI: prefix to recognize the identifier as a DOI (e.g., /paper/DOI:10.1234%2F5678); without it the raw value 10.1234%2F5678 is treated as an unknown paper ID and the API returns a 404.

Compounding this, quote(..., safe="") on line 124 encodes : to %3A, so even if the prefix were preserved, DOI%3A10.1234%2F5678 may not be matched by the server. The fix is to normalise the prefix to uppercase DOI: (instead of stripping it) and pass safe=":" to quote so that scheme separators are left intact.
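
A minimal sketch of the fix described above, assuming the plugin builds the request path with urllib.parse.quote. The helper name paper_path is hypothetical and is not taken from the PR; only normalize_identifier and the quote call appear in the reviewed code.

```python
from urllib.parse import quote


def normalize_identifier(raw: str) -> str:
    """Trim whitespace and canonicalise a doi: prefix to DOI: instead of stripping it."""
    value = raw.strip()
    if value.lower().startswith("doi:"):
        value = "DOI:" + value[4:].strip()
    return value


def paper_path(raw: str) -> str:
    """Hypothetical helper: build the /paper/{id} path segment.

    safe=":" leaves the scheme separator intact while still
    percent-encoding the slash inside the DOI.
    """
    return "/paper/" + quote(normalize_identifier(raw), safe=":")


print(paper_path("doi:10.1234/5678"))
print(paper_path("arXiv:2106.15928"))
```

With this shape, a DOI keeps its DOI: prefix, the slash inside the DOI is still encoded as %2F, and prefixed identifiers such as arXiv: pass through unchanged.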

Comment on lines +57 to +81
{
    "name": "search_semantic_scholar_papers",
    "description": "Search Semantic Scholar papers by keyword.",
    "endpoint": "/tools/search_semantic_scholar_papers",
    "method": "POST",
    "parameters": {
        "query": "string",
        "max_results": "integer 1-10 (default 5)",
        "min_year": "optional integer",
    },
},
{
    "name": "get_semantic_scholar_paper",
    "description": "Get details for a paper by Semantic Scholar ID or DOI.",
    "endpoint": "/tools/get_semantic_scholar_paper",
    "method": "POST",
    "parameters": {"paper_id_or_doi": "string"},
},
{
    "name": "get_semantic_scholar_author_papers",
    "description": "Get recent papers by Semantic Scholar author ID.",
    "endpoint": "/tools/get_semantic_scholar_author_papers",
    "method": "POST",
    "parameters": {"author_id": "string", "max_results": "integer 1-10"},
},

P2 Non-standard manifest parameter schema — Omi may not parse tool parameters correctly

Other plugins in this repo (e.g., omi-zomato-app) use a JSON Schema-shaped parameters object with properties, type, description, and required keys. The current manifest uses plain strings as values (e.g., "string", "integer 1-10 (default 5)"). If Omi's tool-calling layer expects JSON Schema, the LLM will not receive accurate type or constraint information for these tools.

Suggested change
{
    "name": "search_semantic_scholar_papers",
    "description": "Search Semantic Scholar papers by keyword.",
    "endpoint": "/tools/search_semantic_scholar_papers",
    "method": "POST",
    "parameters": {
        "properties": {
            "query": {"type": "string", "description": "Keyword search query"},
            "max_results": {"type": "integer", "description": "Number of results to return (1-10, default 5)"},
            "min_year": {"type": "integer", "description": "Optional earliest publication year filter"},
        },
        "required": ["query"],
    },
},
{
    "name": "get_semantic_scholar_paper",
    "description": "Get details for a paper by Semantic Scholar ID or DOI.",
    "endpoint": "/tools/get_semantic_scholar_paper",
    "method": "POST",
    "parameters": {
        "properties": {
            "paper_id_or_doi": {"type": "string", "description": "Semantic Scholar paper ID or DOI (e.g. DOI:10.xxx/xxx)"},
        },
        "required": ["paper_id_or_doi"],
    },
},
{
    "name": "get_semantic_scholar_author_papers",
    "description": "Get recent papers by Semantic Scholar author ID.",
    "endpoint": "/tools/get_semantic_scholar_author_papers",
    "method": "POST",
    "parameters": {
        "properties": {
            "author_id": {"type": "string", "description": "Semantic Scholar author ID"},
            "max_results": {"type": "integer", "description": "Number of papers to return (1-10, default 5)"},
        },
        "required": ["author_id"],
    },
},
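
One way to catch this kind of manifest drift before deployment is a small shape check over each tool entry. The sketch below is illustrative only: check_tool_parameters and ALLOWED_TYPES are hypothetical names, and the rules encode only what this comment describes (a properties object with typed entries, plus a required list), not Omi's actual manifest validation, which is not shown in this thread.

```python
from typing import Any

# JSON Schema primitive type names accepted by this sketch (an assumption).
ALLOWED_TYPES = {"string", "integer", "number", "boolean", "array", "object"}


def check_tool_parameters(tool: dict[str, Any]) -> list[str]:
    """Return a list of problems found in one manifest tool entry (empty if none)."""
    params = tool.get("parameters")
    if not isinstance(params, dict):
        return ["parameters must be an object"]
    props = params.get("properties")
    if not isinstance(props, dict):
        return ["parameters.properties must be an object"]

    problems: list[str] = []
    # Every property needs a dict spec with a recognised "type".
    for name, spec in props.items():
        if not isinstance(spec, dict) or spec.get("type") not in ALLOWED_TYPES:
            problems.append(f"{name}: missing or unknown 'type'")
    # Every required name must be declared under properties.
    for name in params.get("required", []):
        if name not in props:
            problems.append(f"{name}: listed as required but not in properties")
    return problems


old_style = {
    "name": "get_semantic_scholar_paper",
    "parameters": {"paper_id_or_doi": "string"},
}
new_style = {
    "name": "get_semantic_scholar_paper",
    "parameters": {
        "properties": {"paper_id_or_doi": {"type": "string", "description": "Paper ID or DOI"}},
        "required": ["paper_id_or_doi"],
    },
}

print(check_tool_parameters(old_style))
print(check_tool_parameters(new_style))
```

The plain-string manifest fails at the first structural check, while the JSON Schema-shaped entry passes with no problems reported.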

Comment on lines +6 to +9
class ChatToolResponse(BaseModel):
    """Response model for Omi chat tool endpoints."""
    result: Optional[str] = None
    error: Optional[str] = None

P2 ChatToolResponse allows both fields to be None simultaneously

Both result and error default to None, so it is possible to construct — or accidentally return — ChatToolResponse() with {"result": null, "error": null}. Adding a model validator (e.g., @model_validator(mode="after")) that requires exactly one of the two fields to be set would make the contract explicit and prevent silent empty responses from reaching the Omi platform.
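
A minimal sketch of such a validator, assuming Pydantic v2 (where model_validator is available). It enforces the "exactly one" rule suggested here; the method name check_exactly_one and the error message are illustrative, not taken from the PR.

```python
from typing import Optional

from pydantic import BaseModel, ValidationError, model_validator


class ChatToolResponse(BaseModel):
    """Response model for Omi chat tool endpoints."""

    result: Optional[str] = None
    error: Optional[str] = None

    @model_validator(mode="after")
    def check_exactly_one(self) -> "ChatToolResponse":
        # Reject the cases where both fields are None or both are set.
        if (self.result is None) == (self.error is None):
            raise ValueError("exactly one of 'result' or 'error' must be set")
        return self


try:
    ChatToolResponse()  # neither field set
except ValidationError as exc:
    print("rejected:", exc.errors()[0]["msg"])
```

Relaxing the rule to "at least one" only requires changing the condition to check that both fields are None.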

@liangtovi-debug
Author

Addressed Greptile findings in c2099ae:

  1. Fixed DOI handling in get_semantic_scholar_paper:
    • Preserve DOI namespace as DOI:... in normalization.
    • Keep colon safe while URL-encoding path segment (quote(..., safe=":")) so DOI prefix survives and slash is encoded.
  2. Updated manifest parameter definitions to JSON Schema object format (type/properties/required) for all three tools, matching existing Omi plugin conventions.
  3. Added ChatToolResponse model validation to ensure at least one of result or error is present.

Validation:

  • python3 -m py_compile plugins/omi-semantic-scholar-app/main.py plugins/omi-semantic-scholar-app/models.py
  • Live Semantic Scholar DOI smoke call attempted, currently rate-limited by upstream (HTTP 429), so no deterministic live API assertion in CI here.

Collaborator

@kodjima33 kodjima33 left a comment

thanks for adding the Semantic Scholar app
