1 change: 1 addition & 0 deletions plugins/omi-semantic-scholar-app/Procfile
@@ -0,0 +1 @@
web: uvicorn main:app --host 0.0.0.0 --port $PORT
22 changes: 22 additions & 0 deletions plugins/omi-semantic-scholar-app/README.md
@@ -0,0 +1,22 @@
# Omi Semantic Scholar App

A standalone no-auth Omi integration app that provides chat tools for discovering academic papers from Semantic Scholar.

## Tools

- `search_semantic_scholar_papers`: Search papers by keyword.
- `get_semantic_scholar_paper`: Fetch paper details by Semantic Scholar paper ID or DOI.
- `get_semantic_scholar_author_papers`: Get recent papers by author ID.

## Local Run

```bash
pip install -r requirements.txt
uvicorn main:app --reload --host 0.0.0.0 --port 8080
```

## Manifest

The Omi tools manifest is served at:

- `/.well-known/omi-tools.json`
235 changes: 235 additions & 0 deletions plugins/omi-semantic-scholar-app/main.py
@@ -0,0 +1,235 @@
"""Semantic Scholar no-auth chat tools app for Omi."""
from __future__ import annotations

from typing import Any, Dict, List
from urllib.parse import quote

import httpx
from fastapi import FastAPI

from models import (
    ChatToolResponse,
    GetAuthorPapersRequest,
    GetPaperRequest,
    SearchPapersRequest,
)

API_BASE = "https://api.semanticscholar.org/graph/v1"
TIMEOUT = 20

app = FastAPI(
    title="Semantic Scholar Omi Integration",
    description="No-auth Semantic Scholar chat tools for Omi",
    version="1.0.0",
)


def format_authors(authors: List[Dict[str, Any]]) -> str:
    names = [a.get("name", "Unknown") for a in authors if a.get("name")]
    return ", ".join(names[:6]) if names else "Unknown"


def format_year(year: Any) -> str:
    if isinstance(year, int):
        return str(year)
    return "Unknown"


def normalize_identifier(raw: str) -> str:
    value = raw.strip()
    if value.lower().startswith("doi:"):
        # Preserve DOI namespace expected by Semantic Scholar.
        value = "DOI:" + value[4:].strip()
    return value
Comment on lines +38 to +43
Contributor


P1 DOI prefix stripped and colon encoding — DOI lookups always 404

normalize_identifier removes the doi: prefix before the value is passed to the Semantic Scholar API. Semantic Scholar's paper lookup endpoint requires the DOI: prefix to recognize the identifier as a DOI (e.g., /paper/DOI:10.1234%2F5678); without it the raw value 10.1234%2F5678 is treated as an unknown paper ID and the API returns a 404.

Compounding this, quote(..., safe="") on line 124 encodes : to %3A, so even if the prefix were preserved, DOI%3A10.1234%2F5678 may not be matched by the server. The fix is to normalise the prefix to uppercase DOI: (instead of stripping it) and pass safe=":" to quote so that scheme separators are left intact.
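The fix described in this comment can be sketched with the standard library alone. This is a minimal illustration, not the PR's final code: `build_paper_path` is a hypothetical helper name introduced here, and whether the server would also accept an encoded `%3A` is not verified.

```python
from urllib.parse import quote


def normalize_identifier(raw: str) -> str:
    """Uppercase a leading doi: prefix instead of stripping it."""
    value = raw.strip()
    if value.lower().startswith("doi:"):
        value = "DOI:" + value[4:].strip()
    return value


def build_paper_path(raw: str) -> str:
    """Hypothetical helper: percent-encode the identifier but keep ':' literal."""
    return "/paper/" + quote(normalize_identifier(raw), safe=":")


if __name__ == "__main__":
    # The slash inside the DOI is encoded, the namespace separator is not.
    print(build_paper_path("doi:10.1234/5678"))   # /paper/DOI:10.1234%2F5678
    # With safe="" the colon is encoded as well.
    print(quote("DOI:10.1234/5678", safe=""))     # DOI%3A10.1234%2F5678
```

Identifiers without a `doi:` prefix pass through unchanged apart from whitespace trimming and percent-encoding.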



async def api_get(path: str, params: Dict[str, Any]) -> Dict[str, Any]:
    url = f"{API_BASE}{path}"
    async with httpx.AsyncClient(timeout=TIMEOUT) as client:
        resp = await client.get(url, params=params)
        resp.raise_for_status()
        return resp.json()


@app.get("/.well-known/omi-tools.json")
async def manifest() -> Dict[str, Any]:
    return {
        "tools": [
            {
                "name": "search_semantic_scholar_papers",
                "description": "Search Semantic Scholar papers by keyword.",
                "endpoint": "/tools/search_semantic_scholar_papers",
                "method": "POST",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "query": {"type": "string", "description": "Search query"},
                        "max_results": {
                            "type": "integer",
                            "description": "Max results (1-10, default 5)",
                        },
                        "min_year": {
                            "type": "integer",
                            "description": "Optional minimum publication year",
                        },
                    },
                    "required": ["query"],
                },
            },
            {
                "name": "get_semantic_scholar_paper",
                "description": "Get details for a paper by Semantic Scholar ID or DOI.",
                "endpoint": "/tools/get_semantic_scholar_paper",
                "method": "POST",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "paper_id_or_doi": {
                            "type": "string",
                            "description": "Semantic Scholar paper ID or DOI",
                        }
                    },
                    "required": ["paper_id_or_doi"],
                },
            },
            {
                "name": "get_semantic_scholar_author_papers",
                "description": "Get recent papers by Semantic Scholar author ID.",
                "endpoint": "/tools/get_semantic_scholar_author_papers",
                "method": "POST",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "author_id": {
                            "type": "string",
                            "description": "Semantic Scholar author ID",
                        },
                        "max_results": {
                            "type": "integer",
                            "description": "Max results (1-10, default 5)",
                        },
                    },
                    "required": ["author_id"],
                },
            },
Comment on lines +58 to +114
Contributor


P2 Non-standard manifest parameter schema — Omi may not parse tool parameters correctly

Other plugins in this repo (e.g., omi-zomato-app) use a JSON Schema-shaped parameters object with properties, type, description, and required keys. The current manifest uses plain strings as values (e.g., "string", "integer 1-10 (default 5)"). If Omi's tool-calling layer expects JSON Schema, the LLM will not receive accurate type or constraint information for these tools.

Suggested change
-            {
-                "name": "search_semantic_scholar_papers",
-                "description": "Search Semantic Scholar papers by keyword.",
-                "endpoint": "/tools/search_semantic_scholar_papers",
-                "method": "POST",
-                "parameters": {
-                    "query": "string",
-                    "max_results": "integer 1-10 (default 5)",
-                    "min_year": "optional integer",
-                },
-            },
-            {
-                "name": "get_semantic_scholar_paper",
-                "description": "Get details for a paper by Semantic Scholar ID or DOI.",
-                "endpoint": "/tools/get_semantic_scholar_paper",
-                "method": "POST",
-                "parameters": {"paper_id_or_doi": "string"},
-            },
-            {
-                "name": "get_semantic_scholar_author_papers",
-                "description": "Get recent papers by Semantic Scholar author ID.",
-                "endpoint": "/tools/get_semantic_scholar_author_papers",
-                "method": "POST",
-                "parameters": {"author_id": "string", "max_results": "integer 1-10"},
-            },
+            {
+                "name": "search_semantic_scholar_papers",
+                "description": "Search Semantic Scholar papers by keyword.",
+                "endpoint": "/tools/search_semantic_scholar_papers",
+                "method": "POST",
+                "parameters": {
+                    "properties": {
+                        "query": {"type": "string", "description": "Keyword search query"},
+                        "max_results": {"type": "integer", "description": "Number of results to return (1-10, default 5)"},
+                        "min_year": {"type": "integer", "description": "Optional earliest publication year filter"},
+                    },
+                    "required": ["query"],
+                },
+            },
+            {
+                "name": "get_semantic_scholar_paper",
+                "description": "Get details for a paper by Semantic Scholar ID or DOI.",
+                "endpoint": "/tools/get_semantic_scholar_paper",
+                "method": "POST",
+                "parameters": {
+                    "properties": {
+                        "paper_id_or_doi": {"type": "string", "description": "Semantic Scholar paper ID or DOI (e.g. DOI:10.xxx/xxx)"},
+                    },
+                    "required": ["paper_id_or_doi"],
+                },
+            },
+            {
+                "name": "get_semantic_scholar_author_papers",
+                "description": "Get recent papers by Semantic Scholar author ID.",
+                "endpoint": "/tools/get_semantic_scholar_author_papers",
+                "method": "POST",
+                "parameters": {
+                    "properties": {
+                        "author_id": {"type": "string", "description": "Semantic Scholar author ID"},
+                        "max_results": {"type": "integer", "description": "Number of papers to return (1-10, default 5)"},
+                    },
+                    "required": ["author_id"],
+                },
+            },
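The difference between the two manifest shapes in this thread can be checked mechanically. The sketch below is only an illustration: the exact schema Omi expects is not confirmed here, so the check encodes just the shape the comment describes (`properties`, per-property `type`, `required`), and `find_schema_problems` is a hypothetical helper name.

```python
from typing import Any, Dict, List


def find_schema_problems(tool: Dict[str, Any]) -> List[str]:
    """Return human-readable problems with a tool's `parameters` object."""
    params = tool.get("parameters")
    if not isinstance(params, dict):
        return ["parameters must be an object"]
    props = params.get("properties")
    if not isinstance(props, dict):
        return ["parameters.properties must be an object"]

    problems: List[str] = []
    for name, spec in props.items():
        # Each property should be described by a dict carrying at least a type.
        if not isinstance(spec, dict) or "type" not in spec:
            problems.append(f"property {name!r} needs a dict with a 'type' key")
    for name in params.get("required", []):
        # Every required name must also be declared under properties.
        if name not in props:
            problems.append(f"required name {name!r} is not declared in properties")
    return problems


# Plain-string form flagged by the review comment.
old_tool = {"name": "t", "parameters": {"query": "string"}}
# JSON Schema-shaped form from the suggestion.
new_tool = {
    "name": "t",
    "parameters": {
        "properties": {"query": {"type": "string", "description": "Search query"}},
        "required": ["query"],
    },
}

if __name__ == "__main__":
    print(find_schema_problems(old_tool))
    print(find_schema_problems(new_tool))
```

Running a check like this over every entry in `tools` in a unit test would catch a regression to the plain-string form.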

        ]
    }


@app.post("/tools/search_semantic_scholar_papers", response_model=ChatToolResponse)
async def search_papers(req: SearchPapersRequest) -> ChatToolResponse:
    params: Dict[str, Any] = {
        "query": req.query,
        "limit": req.max_results,
        "fields": "title,year,authors,citationCount,url,venue",
    }
    if req.min_year:
        params["year"] = f"{req.min_year}-"

    try:
        data = await api_get("/paper/search", params)
        papers = data.get("data", [])
        if not papers:
            return ChatToolResponse(result="No papers found.")

        lines = []
        for i, paper in enumerate(papers, start=1):
            title = paper.get("title") or "Untitled"
            year = format_year(paper.get("year"))
            authors = format_authors(paper.get("authors", []))
            venue = paper.get("venue") or "Unknown venue"
            cites = paper.get("citationCount", 0)
            url = paper.get("url") or ""
            lines.append(
                f"{i}. {title}\n Authors: {authors}\n Year: {year} | Venue: {venue} | Citations: {cites}"
                + (f"\n URL: {url}" if url else "")
            )
        return ChatToolResponse(result="\n\n".join(lines))
    except httpx.HTTPStatusError as exc:
        return ChatToolResponse(error=f"Semantic Scholar API error: {exc.response.status_code}")
    except Exception as exc:
        return ChatToolResponse(error=f"Unexpected error: {exc}")


@app.post("/tools/get_semantic_scholar_paper", response_model=ChatToolResponse)
async def get_paper(req: GetPaperRequest) -> ChatToolResponse:
    try:
        identifier = quote(normalize_identifier(req.paper_id_or_doi), safe=":")
        data = await api_get(
            f"/paper/{identifier}",
            {"fields": "title,abstract,year,authors,citationCount,referenceCount,url,venue"},
        )

        title = data.get("title") or "Untitled"
        year = format_year(data.get("year"))
        authors = format_authors(data.get("authors", []))
        venue = data.get("venue") or "Unknown venue"
        citations = data.get("citationCount", 0)
        references = data.get("referenceCount", 0)
        abstract = data.get("abstract") or "No abstract available."
        url = data.get("url") or ""

        result = (
            f"Title: {title}\n"
            f"Authors: {authors}\n"
            f"Year: {year}\n"
            f"Venue: {venue}\n"
            f"Citations: {citations} | References: {references}\n"
            f"Abstract: {abstract}"
            + (f"\nURL: {url}" if url else "")
        )
        return ChatToolResponse(result=result)
    except httpx.HTTPStatusError as exc:
        code = exc.response.status_code
        if code == 404:
            return ChatToolResponse(error="Paper not found.")
        return ChatToolResponse(error=f"Semantic Scholar API error: {code}")
    except Exception as exc:
        return ChatToolResponse(error=f"Unexpected error: {exc}")


@app.post("/tools/get_semantic_scholar_author_papers", response_model=ChatToolResponse)
async def get_author_papers(req: GetAuthorPapersRequest) -> ChatToolResponse:
    try:
        author_id = quote(req.author_id.strip(), safe="")
        data = await api_get(
            f"/author/{author_id}",
            {
                "fields": "name,papers.title,papers.year,papers.citationCount,papers.url",
            },
        )

        author_name = data.get("name") or req.author_id
        papers = data.get("papers", [])
        if not papers:
            return ChatToolResponse(result=f"No papers found for author {author_name}.")

        papers_sorted = sorted(
            papers,
            key=lambda p: ((p.get("year") or 0), (p.get("citationCount") or 0)),
            reverse=True,
        )[: req.max_results]

        lines = [f"Recent papers by {author_name}:"]
        for i, paper in enumerate(papers_sorted, start=1):
            title = paper.get("title") or "Untitled"
            year = format_year(paper.get("year"))
            cites = paper.get("citationCount", 0)
            url = paper.get("url") or ""
            lines.append(
                f"{i}. {title}\n Year: {year} | Citations: {cites}" + (f"\n URL: {url}" if url else "")
            )

        return ChatToolResponse(result="\n\n".join(lines))
    except httpx.HTTPStatusError as exc:
        code = exc.response.status_code
        if code == 404:
            return ChatToolResponse(error="Author not found.")
        return ChatToolResponse(error=f"Semantic Scholar API error: {code}")
    except Exception as exc:
        return ChatToolResponse(error=f"Unexpected error: {exc}")


@app.get("/")
async def root() -> Dict[str, str]:
    return {"message": "Semantic Scholar Omi integration is running."}
31 changes: 31 additions & 0 deletions plugins/omi-semantic-scholar-app/models.py
@@ -0,0 +1,31 @@
"""Pydantic models for Semantic Scholar Omi integration."""
from typing import Optional
from pydantic import BaseModel, Field
from pydantic import model_validator


class ChatToolResponse(BaseModel):
    """Response model for Omi chat tool endpoints."""
    result: Optional[str] = None
    error: Optional[str] = None
Comment on lines +7 to +10
Contributor


P2 ChatToolResponse allows both fields to be None simultaneously

Both result and error default to None, so it is possible to construct — or accidentally return — ChatToolResponse() with {"result": null, "error": null}. Adding a model validator (e.g., @model_validator(mode="after")) that requires exactly one of the two fields to be set would make the contract explicit and prevent silent empty responses from reaching the Omi platform.
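The "exactly one of the two fields" rule from this comment can be sketched without any framework. The example deliberately swaps the pydantic model for a plain dataclass so that it runs with the standard library alone; `ToolResponseSketch` is a hypothetical stand-in, not the PR's `ChatToolResponse`, and in the real model the same condition would sit inside a `@model_validator(mode="after")` method.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToolResponseSketch:
    """Stand-in for ChatToolResponse; a plain dataclass, not the pydantic model."""

    result: Optional[str] = None
    error: Optional[str] = None

    def __post_init__(self) -> None:
        # The two "is None" checks are equal when neither field or both fields
        # are set, so this rejects every case except exactly one being set.
        if (self.result is None) == (self.error is None):
            raise ValueError("Exactly one of result or error must be provided.")


if __name__ == "__main__":
    print(ToolResponseSketch(result="No papers found."))
    try:
        ToolResponseSketch()
    except ValueError as exc:
        print(f"rejected: {exc}")
```

Note that this is stricter than a check that only rejects the case where both fields are `None`: it also rejects a response that sets `result` and `error` together.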


    @model_validator(mode="after")
    def validate_result_or_error(self):
        if self.result is None and self.error is None:
            raise ValueError("Either result or error must be provided.")
        return self


class SearchPapersRequest(BaseModel):
    query: str = Field(..., min_length=2, max_length=200)
    max_results: int = Field(default=5, ge=1, le=10)
    min_year: Optional[int] = Field(default=None, ge=1800, le=2100)


class GetPaperRequest(BaseModel):
    paper_id_or_doi: str = Field(..., min_length=2, max_length=200)


class GetAuthorPapersRequest(BaseModel):
    author_id: str = Field(..., min_length=1, max_length=100)
    max_results: int = Field(default=5, ge=1, le=10)
9 changes: 9 additions & 0 deletions plugins/omi-semantic-scholar-app/railway.toml
@@ -0,0 +1,9 @@
[build]
builder = "NIXPACKS"

[deploy]
startCommand = "uvicorn main:app --host 0.0.0.0 --port $PORT"
healthcheckPath = "/"
healthcheckTimeout = 100
restartPolicyType = "ON_FAILURE"
restartPolicyMaxRetries = 10
4 changes: 4 additions & 0 deletions plugins/omi-semantic-scholar-app/requirements.txt
@@ -0,0 +1,4 @@
fastapi==0.104.1
uvicorn==0.24.0
httpx==0.25.2
pydantic==2.5.2