
#25 Issue: Wikidata URI Grounding #71

Open
aliiakbarkhan wants to merge 4 commits into dbpedia:master from aliiakbarkhan:aliiakbarkhan/issue25

Conversation


@aliiakbarkhan aliiakbarkhan commented Mar 10, 2026

Wikidata URI Grounding

Current Architecture:
The chatbot grounds questions to DBpedia URIs only through this pipeline:

  1. QANARY (QANARY.java) calls http://qanswer-core1.univ-st-etienne.fr/api/gerbil with kb=dbpedia → returns DBpedia URIs
  2. NLHandler (NLHandler.java) processes the URIs by querying DBpedia's SPARQL endpoint
  3. SPARQL (SPARQL.java) queries https://dbpedia.org/sparql to get entity info (label, abstract, thumbnail, etc.)
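Step 1 of this pipeline is just an HTTP GET against the QAnswer gerbil endpoint, with the question text and a kb parameter selecting the knowledge base. A minimal sketch of building that request URL (the question parameter name and encoding here are my assumptions; only the endpoint and kb=dbpedia appear above, and QANARY.java may differ):

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Sketch of the grounding request URL. The "query" parameter name is an
// assumption; only the endpoint and kb=dbpedia/kb=wikidata come from the PR.
class GroundingRequestSketch {
    static final String ENDPOINT = "http://qanswer-core1.univ-st-etienne.fr/api/gerbil";

    static String buildUrl(String question, String knowledgeBase) {
        String encoded = URLEncoder.encode(question, StandardCharsets.UTF_8);
        return ENDPOINT + "?query=" + encoded + "&kb=" + knowledgeBase;
    }

    public static void main(String[] args) {
        // The new search flow issues the same request twice, once per KB:
        System.out.println(buildUrl("Who is Ada Lovelace?", "dbpedia"));
        System.out.println(buildUrl("Who is Ada Lovelace?", "wikidata"));
    }
}
```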

Changes I have made:

  1. QANARY.java — Add Wikidata KB query
  2. QAService.Data — Track URI source
  3. New WikidataSPARQL.java — Query Wikidata SPARQL endpoint
  4. NLHandler.java — Handle Wikidata URIs
  5. Utility.java — Add Wikidata URL helpers
  6. Ontology.java — Add Wikidata ontology constants
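For item 5, the Utility additions boil down to prefix checks on the URI plus extraction of the trailing Q-id. A sketch of what such helpers could look like (method names follow the PR description; accepting both http and https and strictly validating the Q-id shape mirror suggestions from the review comments later in this thread, so treat those details as assumptions about the final code):

```java
import java.util.regex.Pattern;

// Sketch of the Utility helpers added in this PR. The https handling and the
// strict Q\d+ validation are assumptions, not copied from Utility.java.
class WikidataUriUtil {
    private static final Pattern ENTITY_ID = Pattern.compile("Q\\d+");

    static boolean isWikidataURI(String uri) {
        return uri != null && (uri.startsWith("http://www.wikidata.org/entity/")
                || uri.startsWith("https://www.wikidata.org/entity/"));
    }

    static boolean isDBpediaURI(String uri) {
        return uri != null && (uri.startsWith("http://dbpedia.org/resource/")
                || uri.startsWith("https://dbpedia.org/resource/"));
    }

    // Returns the trailing Q-id (e.g. "Q42"), or null when the URI is not a
    // well-formed Wikidata entity URI, so malformed ids never reach SPARQL.
    static String extractWikidataEntityId(String uri) {
        if (!isWikidataURI(uri)) return null;
        String id = uri.substring(uri.lastIndexOf('/') + 1);
        return ENTITY_ID.matcher(id).matches() ? id : null;
    }

    public static void main(String[] args) {
        System.out.println(extractWikidataEntityId("http://www.wikidata.org/entity/Q42")); // Q42
    }
}
```

Returning null for anything that is not exactly a Q-id keeps arbitrary strings out of downstream query concatenation.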

Summary by CodeRabbit

  • New Features

    • Added Wikidata support to surface richer entity cards (labels, descriptions, images, links).
  • Improvements

    • Search queries DBpedia first, then Wikidata, merging results and favoring DBpedia hits.
    • URIs are routed to the appropriate lookup flow; Wikidata-specific handling and option shortcuts applied.
    • Template handling now gracefully handles missing entity info.
  • Bug Fixes

    • More resilient behavior when external lookups return no data.


coderabbitai bot commented Mar 10, 2026

📝 Walkthrough


Adds Wikidata SPARQL integration: new WikidataSPARQL service, Utility helpers for URI detection and ID extraction, QANARY updated to query DBpedia then Wikidata and merge results, handlers/templates routed/short‑circuited for Wikidata URIs, plus a VSCode Java config change and two small test files.

Changes

| Cohort | File(s) | Summary |
| --- | --- | --- |
| Editor config | `.vscode/settings.json` | Added `"java.compile.nullAnalysis.mode": "automatic"` (IDE configuration change only). |
| Application init | `src/main/java/chatbot/Application.java` | Added `WikidataSPARQL wikidataSparql` to Helper, initialized in the constructor, exposed via `getWikidataSparql()`. |
| Utility helpers | `src/main/java/chatbot/lib/Utility.java` | Added `isWikidataURI`, `isDBpediaURI`, and `extractWikidataEntityId` for URI detection and ID extraction. |
| Wikidata API service | `src/main/java/chatbot/lib/api/WikidataSPARQL.java` | New service to build/execute Wikidata SPARQL queries, fetch label/description/image, construct ResponseData with navigation/learn-more buttons, execute queries with timeout and logging, and handle errors. |
| QANARY adjustments | `src/main/java/chatbot/lib/api/qa/QANARY.java` | `makeRequest` now accepts `knowledgeBase`; added `parseResponse`; new `search` method queries DBpedia first, then Wikidata, merges/returns results, and improves error logging. |
| NL handler routing | `src/main/java/chatbot/lib/handlers/NLHandler.java` | Routes Wikidata URIs early to WikidataSPARQL and preserves the existing disambiguation/DBpedia flow for non-Wikidata URIs. |
| Template handling | `src/main/java/chatbot/lib/handlers/TemplateHandler.java` | ENTITY_INFORMATION selects WikidataSPARQL for Wikidata URIs, adds a null check on results, and falls back to a no-results response when needed. |
| Options template | `src/main/java/chatbot/lib/handlers/templates/OptionsTemplateHandler.java` | `getLearnMoreOptions` short-circuits for Wikidata URIs to return default options, bypassing RDF type lookups and TV/Movie logic. |
| Misc / tests | `test.java`, `test2.java` | Added test.java (empty Test class with unused Jena import) and test2.java (simple main demonstrating QueryEngineHTTP usage with a header and timeout). |
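The WikidataSPARQL service described above is the core of the change. Independent of the Jena plumbing, the query it needs is a small SELECT over the entity's label, description, and image; a hedged sketch of that query construction (the specific properties, such as wdt:P18 for the image and an English-language filter, are my assumptions; the PR only says the service fetches label/description/image):

```java
// Sketch of the SPARQL query WikidataSPARQL might build for one entity.
// wdt:P18 (image) and the English-label filters are assumptions about the
// query shape, not copied from WikidataSPARQL.java.
class WikidataQuerySketch {
    static String buildEntityQuery(String entityId) {
        return "PREFIX wd: <http://www.wikidata.org/entity/>\n"
             + "PREFIX wdt: <http://www.wikidata.org/prop/direct/>\n"
             + "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
             + "PREFIX schema: <http://schema.org/>\n"
             + "SELECT ?label ?description ?image WHERE {\n"
             + "  wd:" + entityId + " rdfs:label ?label .\n"
             + "  FILTER(LANG(?label) = \"en\")\n"
             + "  OPTIONAL { wd:" + entityId + " schema:description ?description .\n"
             + "             FILTER(LANG(?description) = \"en\") }\n"
             + "  OPTIONAL { wd:" + entityId + " wdt:P18 ?image . }\n"
             + "} LIMIT 1";
    }

    public static void main(String[] args) {
        System.out.println(buildEntityQuery("Q42"));
    }
}
```

Because entityId is concatenated directly into the query string, validating it (only accepting well-formed Q-ids) before this point is what keeps the string building safe.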

Sequence Diagram

```mermaid
sequenceDiagram
    participant Client as Client/Handler
    participant QANARY as QANARY Service
    participant DBpedia as DBpedia Endpoint
    participant WikidataSvc as WikidataSPARQL
    participant WikidataEP as Wikidata Endpoint

    Client->>QANARY: search(question)
    QANARY->>DBpedia: makeRequest(question, "dbpedia")
    DBpedia-->>QANARY: results
    QANARY->>QANARY: parseResponse(dbpedia)
    alt DBpedia returned useful URIs/literals
        QANARY-->>Client: return results
    else
        QANARY->>QANARY: makeRequest(question, "wikidata")
        QANARY->>WikidataSvc: getEntityInformation(uri)
        WikidataSvc->>WikidataEP: SPARQL query (with prefixes, timeout)
        WikidataEP-->>WikidataSvc: results
        WikidataSvc-->>QANARY: ResponseData (label, desc, image, buttons)
        QANARY->>QANARY: merge results
        QANARY-->>Client: Combined QAService.Data
    end
```

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 19.05%, which is below the required threshold of 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped: CodeRabbit's high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly and specifically describes the main change: adding Wikidata URI grounding support. It directly corresponds to the PR's primary objective of enabling Wikidata URI handling alongside the existing DBpedia grounding across multiple components (QANARY, WikidataSPARQL, NLHandler, etc.). |



@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/main/java/chatbot/lib/api/qa/QANARY.java (1)

65-93: ⚠️ Potential issue | 🟡 Minor

Add defensive null checks for JSON parsing.

The JSON navigation at lines 70-71 assumes a specific structure but doesn't guard against missing/null nodes. If the response structure differs (e.g., empty questions array), this will throw NullPointerException or IndexOutOfBoundsException.

🛡️ Proposed fix
 private QAService.Data parseResponse(String response) throws Exception {
     QAService.Data data = new QAService.Data();
     if (response != null) {
         ObjectMapper mapper = new ObjectMapper();
         JsonNode rootNode = mapper.readTree(response);
-        JsonNode answers = mapper
-                .readTree(rootNode.findValue("questions").get(0).get("question").get("answers").getTextValue());
+        JsonNode questions = rootNode.findValue("questions");
+        if (questions == null || !questions.isArray() || questions.size() == 0) {
+            return data;
+        }
+        JsonNode questionNode = questions.get(0).get("question");
+        if (questionNode == null || questionNode.get("answers") == null) {
+            return data;
+        }
+        JsonNode answers = mapper.readTree(questionNode.get("answers").getTextValue());

         if (answers != null) {
             // ... rest unchanged
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/main/java/chatbot/lib/api/qa/QANARY.java` around lines 65 - 93, The
parseResponse method assumes a rigid JSON shape and can NPE/IndexOutOfBounds;
add defensive null/shape checks before traversing nodes: verify rootNode is
non-null, ensure rootNode.findValue("questions") returns a node and is an array
with size>0, confirm .get(0).get("question") and its "answers" field exist and
are non-null before calling getTextValue/readTree; similarly guard that
answers.get("results") and .get("bindings") exist and are arrays before
iterating, and skip or return empty QAService.Data if any expected node is
missing. Use JsonNode.isMissingNode()/isNull()/isArray()/size() or path() checks
around rootNode, answers, bindings and inside the binding iteration (check
value.get("type") and value.get("value") are present) to avoid exceptions.
🧹 Nitpick comments (2)
src/main/java/chatbot/lib/api/qa/QANARY.java (1)

95-110: Asymmetric error handling between DBpedia and Wikidata queries.

DBpedia query failures propagate as exceptions while Wikidata failures are caught and logged. This means a DBpedia service issue will fail the entire request even if Wikidata would have returned valid results.

Consider consistent handling:

🛡️ Proposed fix for resilient degradation
 public QAService.Data search(String question) throws Exception {
-    // Query DBpedia KB
-    QAService.Data data = parseResponse(makeRequest(question, "dbpedia"));
+    QAService.Data data = new QAService.Data();
+    
+    // Query DBpedia KB
+    try {
+        QAService.Data dbpediaData = parseResponse(makeRequest(question, "dbpedia"));
+        data.addData(dbpediaData, false);
+    } catch (Exception e) {
+        logger.error("DBpedia QANARY query failed: " + e.getMessage());
+    }

     // Query Wikidata KB and merge results
     try {
         QAService.Data wikidataData = parseResponse(makeRequest(question, "wikidata"));
         data.addData(wikidataData, false);
     } catch (Exception e) {
         logger.error("Wikidata QANARY query failed, continuing with DBpedia results only: " + e.getMessage());
     }

     return data;
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/main/java/chatbot/lib/api/qa/QANARY.java` around lines 95 - 110, The
search method currently treats DBpedia failures as exceptions (from
parseResponse(makeRequest(question, "dbpedia"))) but swallows Wikidata errors;
make error handling symmetric by performing each KB request in its own
try/catch: call makeRequest and parseResponse for "dbpedia" inside a try block
and if it fails log the error via logger.error including exception details and
continue with an empty QAService.Data placeholder, then do the same for
"wikidata" (as with current code) and finally merge results via
data.addData(wikidataData, false) only when both parsed objects exist; reference
the search method, makeRequest, parseResponse, QAService.Data, data.addData and
logger when implementing this resilient fallback so a failure in one KB does not
abort the whole request.
src/main/java/chatbot/lib/api/WikidataSPARQL.java (1)

72-75: capitalizeAll may alter entity descriptions inappropriately.

Wikidata descriptions are typically well-formatted. Applying capitalizeAll may produce unintended results for descriptions containing proper nouns, acronyms, or technical terms (e.g., "CEO of Apple Inc." → "Ceo Of Apple Inc.").

Consider using the description as-is or only capitalizing the first letter:

 if (result.get("description") != null) {
-    responseData.setText(Utility.capitalizeAll(result.get("description").asLiteral().getString()));
+    responseData.setText(result.get("description").asLiteral().getString());
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/main/java/chatbot/lib/api/WikidataSPARQL.java` around lines 72 - 75, The
code in WikidataSPARQL currently forces descriptions through
Utility.capitalizeAll which can mangle proper nouns and acronyms; change the
handling in the block that reads result.get("description") so it either assigns
the raw literal (result.get("description").asLiteral().getString()) to
responseData.setText(...) or applies a safer transform that only uppercases the
first character (implement/use a Utility.capitalizeFirst(String) helper) instead
of Utility.capitalizeAll; update the assignment inside the if that calls
responseData.setText(...) accordingly.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/main/java/chatbot/lib/api/WikidataSPARQL.java`:
- Around line 59-99: The code can NPE in the finally block if
executeQuery(query) throws and queryExecution stays null; update the
WikidataSPARQL logic that calls executeQuery to either (a) initialize
QueryExecution queryExecution = null and in the finally check queryExecution !=
null before calling queryExecution.close(), or (b) move the executeQuery call
inside the try so any exception prevents reaching the finally with an
uninitialized variable; ensure you reference QueryExecution,
queryExecution.close(), and the executeQuery(query) call when editing.
- Around line 106-128: getLabel currently lets exceptions from the SPARQL call
propagate; mirror the error handling used in getEntityInformation by wrapping
the query execution and result parsing in a try/catch that catches Exception,
logs a descriptive error with the entityId and exception (use the class logger
used elsewhere, e.g., LOGGER or logger), and returns null on error; also
initialize QueryExecution to null and ensure the finally block checks for null
before calling close() so resources are always cleaned up.

In `@src/main/java/chatbot/lib/handlers/TemplateHandler.java`:
- Around line 121-128: In TemplateHandler inside the
TemplateType.ENTITY_INFORMATION case, avoid adding null to the carousel by
checking the result of
helper.getWikidataSparql().getEntityInformation(payload[1]) and
helper.getSparql().getEntityInformation(payload[1]) before calling
responseGenerator.addCarouselResponse; use Utility.isWikidataURI(payload[1]) to
pick which call, store the result in a local variable (e.g., entityInfo), and
only add it to the ArrayList (or skip creating/adding the carousel) if
entityInfo != null so the carousel renderer never receives a null ResponseData.

In `@src/main/java/chatbot/lib/handlers/templates/OptionsTemplateHandler.java`:
- Around line 87-92: The early Wikidata short-circuit in OptionsTemplateHandler
(Utility.isWikidataURI(uri)) still returns getDefaultOptions(...) which includes
"Similar" and "Related" buttons that call GenesisService.getSimilarEntities /
getRelatedEntities and ultimately pass the Wikidata URI through makeRequest;
change this so Wikidata URIs do not surface those buttons: update
getDefaultOptions (or the branch that calls
responseGenerator.addSmartReplyResponse) to detect Utility.isWikidataURI(uri)
and remove or omit the "Similar" and "Related" actions, or alternatively
implement a Wikidata-specific handler path that maps/rewrites the URI before
calling GenesisService if you prefer to keep the buttons (modify GenesisService
invocation accordingly).

In `@src/main/java/chatbot/lib/Utility.java`:
- Around line 116-123: The extractWikidataEntityId method currently returns the
last path segment indiscriminately, which allows malformed or malicious values
to flow into SPARQL in WikidataSPARQL; update extractWikidataEntityId to
validate the extracted id strictly (e.g. trim/URL-decode the last segment and
accept only a pattern like ^Q\d+$), and return null or throw an
IllegalArgumentException for anything that doesn't match so only safe entity IDs
are ever concatenated into WikidataSPARQL.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: ca2785bd-8b95-4db6-a12d-a044c83ed096

📥 Commits

Reviewing files that changed from the base of the PR and between e4c2631 and bd4b34c.

📒 Files selected for processing (8)
  • .vscode/settings.json
  • src/main/java/chatbot/Application.java
  • src/main/java/chatbot/lib/Utility.java
  • src/main/java/chatbot/lib/api/WikidataSPARQL.java
  • src/main/java/chatbot/lib/api/qa/QANARY.java
  • src/main/java/chatbot/lib/handlers/NLHandler.java
  • src/main/java/chatbot/lib/handlers/TemplateHandler.java
  • src/main/java/chatbot/lib/handlers/templates/OptionsTemplateHandler.java

Comment on lines +87 to +92

// Wikidata URIs cannot be looked up via DBpedia SPARQL for RDF types,
// so skip the TV/Movie-specific options and show defaults directly
if(Utility.isWikidataURI(uri)) {
return responseGenerator.addSmartReplyResponse(getDefaultOptions(uri, label));
}


⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check if GenesisService handles Wikidata URIs or only DBpedia
rg -n -A 10 'class GenesisService' --type java
rg -n 'getSimilarEntities|getRelatedEntities' --type java -A 5

Repository: dbpedia/chatbot

Length of output: 4260


🏁 Script executed:

#!/bin/bash
# Check makeRequest implementation
rg -n 'makeRequest' src/main/java/chatbot/lib/api/dbpedia/GenesisService.java -A 20

# Check getDefaultOptions method
rg -n 'getDefaultOptions' src/main/java/chatbot/lib/handlers/templates/OptionsTemplateHandler.java -A 15

Repository: dbpedia/chatbot

Length of output: 4772


🏁 Script executed:

#!/bin/bash
# Check how LOAD_SIMILAR and LOAD_RELATED are handled
rg -n 'LOAD_SIMILAR|LOAD_RELATED' src/main/java/chatbot/lib/handlers/templates/OptionsTemplateHandler.java -B 2 -A 8

# Check for any Wikidata-related Genesis handling
rg -n 'Wikidata|wikidata' src/main/java/chatbot/lib/api/dbpedia/GenesisService.java

# Look for any other references to Wikidata and Genesis together
rg -n 'getSimilarEntities|getRelatedEntities' src/main/java/chatbot/lib/handlers/templates/OptionsTemplateHandler.java -B 3 -A 3

Repository: dbpedia/chatbot

Length of output: 2657


Similar and Related buttons will pass Wikidata URIs to Genesis API without Wikidata-specific handling.

The early exit at line 90 correctly avoids DBpedia SPARQL RDF type lookups for Wikidata URIs, but getDefaultOptions() still adds "Similar" and "Related" buttons that invoke GenesisService.getSimilarEntities() and GenesisService.getRelatedEntities(). When clicked with a Wikidata URI, these buttons pass the URI directly to the Genesis API (line 39, 42) without any Wikidata-aware logic. The makeRequest() method sends the URI as-is with no differentiation between DBpedia and Wikidata sources, and test coverage only demonstrates DBpedia URIs.

Consider either:

  1. Disabling Similar/Related for Wikidata URIs
  2. Implementing Wikidata-specific alternatives


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/main/java/chatbot/lib/api/qa/QANARY.java (1)

77-91: ⚠️ Potential issue | 🟠 Major

Handle plain "literal" bindings as well.

SPARQL JSON results can use "literal" for untyped string values. Right now those answers are ignored, so some valid QANARY results disappear during parsing.

Suggested fix
-                        switch (value.get("type").getTextValue()) {
+                        switch (value.get("type").getTextValue()) {
                             case "uri":
                                 data.addURI(value.get("value").getTextValue());
                                 break;
+                            case "literal":
                             case "typed-literal":
                                 data.addLiteral(value.get("value").getTextValue());
                                 break;
                         }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/main/java/chatbot/lib/api/qa/QANARY.java` around lines 77 - 91, The
SPARQL result parsing in QANARY only handles "uri" and "typed-literal" types so
plain "literal" bindings are dropped; update the switch in the loop that
iterates over binding/entry/value (the code that checks
value.get("type").getTextValue()) to also handle the "literal" case by calling
data.addLiteral(value.get("value").getTextValue()) (i.e., add a case "literal":
path alongside "typed-literal"), ensuring plain string literals are captured the
same way as typed-literal values.
♻️ Duplicate comments (1)
src/main/java/chatbot/lib/api/WikidataSPARQL.java (1)

120-136: ⚠️ Potential issue | 🟠 Major

Move executeQuery(query) inside the try.

getLabel still calls executeQuery(query) before the guarded block. If Jena throws while creating the query execution, the catch/finally never runs and this method propagates the failure instead of returning null like getEntityInformation().

Suggested fix
-        QueryExecution queryExecution = executeQuery(query);
+        QueryExecution queryExecution = null;
         String label = null;
 
         try {
+            queryExecution = executeQuery(query);
             Iterator<QuerySolution> results = queryExecution.execSelect();
             if (results.hasNext()) {
                 label = results.next().get("label").asLiteral().getString();
             }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/main/java/chatbot/lib/api/WikidataSPARQL.java` around lines 120 - 136, In
getLabel, move the call to executeQuery(query) into the try block so
QueryExecution is created under the catch/finally guard (currently
executeQuery(query) is called before try); ensure the QueryExecution local
(queryExecution) is declared outside the try, assigned inside the try, and still
closed in the finally if non-null so any exception from executeQuery or
execSelect is caught and the method returns null consistent with
getEntityInformation().
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/main/java/chatbot/lib/api/qa/QANARY.java`:
- Around line 103-117: search() is blocking because it waits for both DBpedia
and Wikidata sequentially; modify the Wikidata call to run asynchronously so
DBpedia results can be returned immediately and Wikidata results are merged
when/if they arrive. Specifically, keep the DBpedia flow using
makeRequest/parseResponse/QAService.Data and data.addData as is, but dispatch
the Wikidata request (the makeRequest(question, "wikidata") -> parseResponse ->
data.addData sequence) to a background task (e.g.,
ExecutorService/CompletableFuture) with a bounded timeout and non-blocking error
handling so the search() method does not wait for Wikidata to complete; ensure
the background task logs errors and merges results into the same data object
when completed.

In `@src/main/java/chatbot/lib/api/WikidataSPARQL.java`:
- Around line 146-149: The code currently uses QueryEngineHTTP via
QueryExecutionFactory.sparqlService and sets only query params (addParam), which
cannot set the required HTTP "User-Agent" header for Wikidata; replace the
QueryEngineHTTP usage with the newer QueryExecutionHTTP (or construct a
QueryExecution via QueryExecutionHTTP) and call its httpHeader("User-Agent",
"<your-descriptive-agent-with-contact>") before executing the query (keep
existing timeout/format params if needed), ensuring the User-Agent string
includes contact info as required by Wikidata.

In `@src/main/java/chatbot/lib/Utility.java`:
- Around line 108-114: The helpers isWikidataURI and isDBpediaURI only match
http:// URIs so HTTPS variants are missed; update both methods to accept either
scheme by checking uri != null and
(uri.startsWith("http://www.wikidata.org/entity/") ||
uri.startsWith("https://www.wikidata.org/entity/")) for isWikidataURI and
similarly for isDBpediaURI with "http(s)://dbpedia.org/resource/"; keep the null
check and return the boolean expression directly so callers like TemplateHandler
will correctly detect HTTPS resource URIs.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 56b2e47f-18a4-4284-8cef-6df2bf628a76

📥 Commits

Reviewing files that changed from the base of the PR and between bd4b34c and 3d7011b.

📒 Files selected for processing (4)
  • src/main/java/chatbot/lib/Utility.java
  • src/main/java/chatbot/lib/api/WikidataSPARQL.java
  • src/main/java/chatbot/lib/api/qa/QANARY.java
  • src/main/java/chatbot/lib/handlers/TemplateHandler.java


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/main/java/chatbot/lib/api/qa/QANARY.java (1)

77-91: ⚠️ Potential issue | 🟡 Minor

Guard the nested results.bindings access before iterating.

At Line 78, the parser still assumes answers.results.bindings and each binding’s type/value nodes are always present. A partial or error-shaped QAnswer payload will still throw here instead of cleanly returning an empty QAService.Data.

Proposed hardening
-            if (answers != null) {
-                JsonNode bindings = answers.get("results").get("bindings");
-                for (JsonNode binding : bindings) {
+            if (answers != null) {
+                JsonNode results = answers.get("results");
+                JsonNode bindings = results == null ? null : results.get("bindings");
+                if (bindings == null || !bindings.isArray()) {
+                    return data;
+                }
+                for (JsonNode binding : bindings) {
                     Iterator<Map.Entry<String, JsonNode>> nodes = binding.getFields();
                     while (nodes.hasNext()) {
                         Map.Entry<String, JsonNode> entry = nodes.next();
                         JsonNode value = entry.getValue();
-                        switch (value.get("type").getTextValue()) {
+                        JsonNode typeNode = value == null ? null : value.get("type");
+                        JsonNode valueNode = value == null ? null : value.get("value");
+                        if (typeNode == null || valueNode == null) {
+                            continue;
+                        }
+                        switch (typeNode.getTextValue()) {
                             case "uri":
-                                data.addURI(value.get("value").getTextValue());
+                                data.addURI(valueNode.getTextValue());
                                 break;
                             case "typed-literal":
-                                data.addLiteral(value.get("value").getTextValue());
+                                data.addLiteral(valueNode.getTextValue());
                                 break;
                         }
                     }
                 }
             }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/main/java/chatbot/lib/api/qa/QANARY.java` around lines 77 - 91, The code
in QANARY assumes answers.get("results").get("bindings") and each binding's
"type"/"value" exist; guard these accesses by first checking answers != null and
answers.has("results") and that answers.get("results").has("bindings") and that
bindings is an array (bindings.isArray()) before looping; inside the loop, check
each binding hasNonNull("type") and hasNonNull("value") (or use
binding.path(...).isMissingNode()/isNull checks) before calling
getTextValue()/asText() and only handle "uri" and "typed-literal" when both
fields are present; if any of those checks fail, skip that binding and
ultimately return an empty QAService.Data as before.
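The guarding pattern the prompt describes can be sketched without Jackson at all; the stand-in below uses plain `Map`/`List` structures in place of the `JsonNode` tree (both return `null` for missing keys), and the method name `extractUris` is illustrative, not the project's actual API:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Sketch of the guarded bindings extraction: every level of the
// results/bindings/type/value path is checked before use, and malformed
// bindings are skipped instead of throwing a NullPointerException.
public class GuardedBindings {
    @SuppressWarnings("unchecked")
    static List<String> extractUris(Map<String, Object> answers) {
        List<String> uris = new ArrayList<>();
        if (answers == null) return uris;                      // missing payload
        Object results = answers.get("results");
        if (!(results instanceof Map)) return uris;            // missing "results"
        Object bindings = ((Map<String, Object>) results).get("bindings");
        if (!(bindings instanceof List)) return uris;          // missing or non-array "bindings"
        for (Object b : (List<Object>) bindings) {
            if (!(b instanceof Map)) continue;
            for (Object v : ((Map<String, Object>) b).values()) {
                if (!(v instanceof Map)) continue;
                Map<String, Object> cell = (Map<String, Object>) v;
                Object type = cell.get("type");
                Object value = cell.get("value");
                if (type == null || value == null) continue;   // skip malformed binding
                if ("uri".equals(type)) uris.add(value.toString());
            }
        }
        return uris;
    }

    public static void main(String[] args) {
        Map<String, Object> cell = Map.of("type", "uri", "value", "http://www.wikidata.org/entity/Q42");
        Map<String, Object> binding = Map.of("x", cell);
        Map<String, Object> answers = Map.of("results", Map.of("bindings", List.of(binding)));
        System.out.println(extractUris(answers));              // the one URI is extracted
        System.out.println(extractUris(Map.of("results", Map.of()))); // missing bindings: empty list
    }
}
```

With Jackson itself the same shape falls out of `node.path("results").path("bindings")` plus `isMissingNode()`/`hasNonNull()` checks, as the prompt suggests.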
🧹 Nitpick comments (1)
test.java (1)

1-2: Drop the placeholder Test class or turn it into a real test.

This file currently adds no coverage for the Wikidata/SPARQL path and is easy to leave behind as dead code. If it is meant to validate the new integration, please move it under the test sources and add an actual assertion-based test; otherwise I’d remove it.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test.java` around lines 1 - 2, The placeholder class Test should be removed
or replaced with a real test: either delete the class Test from production
sources to avoid dead code, or move it to the test sources and implement an
assertion-based unit/integration test that exercises the Wikidata/SPARQL path
(use QueryExecutionHTTP in your test to run a sample query and assert expected
results), annotate the class/methods with your test framework (e.g., JUnit
`@Test`) and give the class a descriptive name (e.g.,
WikidataSparqlIntegrationTest) so it provides actual coverage and cannot be
accidentally left behind.
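A dependency-free sketch of the kind of assertion-based test the prompt asks for is below. `isWikidataUri` and `extractWikidataId` are local stand-ins for the helpers this PR adds to `Utility.java` (their real names and signatures may differ); a proper version would live under the test sources and use JUnit `@Test` methods instead of a `main`:

```java
// Minimal assertion-based checks for Wikidata URI detection and Q-id
// extraction. The helper methods here are hypothetical stand-ins, not the
// project's real Utility API.
public class WikidataUriHelperTest {
    static boolean isWikidataUri(String uri) {
        return uri != null && uri.startsWith("http://www.wikidata.org/entity/");
    }

    static String extractWikidataId(String uri) {
        // The Q-identifier is the last path segment, e.g. "Q42".
        return uri.substring(uri.lastIndexOf('/') + 1);
    }

    public static void main(String[] args) {
        String uri = "http://www.wikidata.org/entity/Q42";
        if (!isWikidataUri(uri)) throw new AssertionError("should detect Wikidata URI");
        if (isWikidataUri("http://dbpedia.org/resource/Tokyo")) throw new AssertionError("DBpedia URI misdetected");
        if (!"Q42".equals(extractWikidataId(uri))) throw new AssertionError("wrong Q-id");
        System.out.println("all checks passed");
    }
}
```

An integration test against the live Wikidata endpoint (via `QueryExecutionHTTP`, as the prompt mentions) would sit on top of checks like these but needs network access, so it belongs in a separately tagged test suite.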

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 9c361229-a62c-4599-a6b7-d351253d22d6

📥 Commits

Reviewing files that changed from the base of the PR and between 3d7011b and ae9b7a9.

📒 Files selected for processing (2)
  • src/main/java/chatbot/lib/api/qa/QANARY.java
  • test.java

Comment on lines +111 to +115
// If DBpedia yielded an answer, return early so we don't pay
// the extra latency waiting for Wikidata.
if (!data.getUris().isEmpty() || !data.getLiterals().isEmpty()) {
return data;
}
⚠️ Potential issue | 🟠 Major

Don’t short-circuit on DBpedia literals.

At Line 113, returning when literals is non-empty skips the Wikidata lookup even though the downstream grounding path only becomes useful when uris are present. In src/main/java/chatbot/lib/api/qa/QAService.java:39-48, WolframAlpha can overwrite QANARY literals later, and in src/main/java/chatbot/lib/handlers/NLHandler.java:94-147 literal responses bypass URI enrichment entirely. A DBpedia literal here can therefore suppress the Wikidata URI that this PR is trying to add.

Minimal fix
-        if (!data.getUris().isEmpty() || !data.getLiterals().isEmpty()) {
+        if (!data.getUris().isEmpty()) {
             return data;
         }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/main/java/chatbot/lib/api/qa/QANARY.java` around lines 111 - 115, The
current early-return in QANARY.java uses data.getUris() || data.getLiterals(),
which causes a DBpedia literal to short-circuit and skip the Wikidata lookup;
change the logic to only short-circuit when data.getUris() is non-empty (i.e.,
only return early if data.getUris().isEmpty() is false) so that a DBpedia
literal does not prevent running the Wikidata enrichment; update the condition
around the return (referencing data.getUris() and data.getLiterals() and the
surrounding block) so literals do not bypass the Wikidata path.
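The control flow the reviewer suggests can be sketched as follows; `Data` and the lookup `Supplier`s are simplified stand-ins for `QAService.Data` and the DBpedia/Wikidata calls, not the project's real classes:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Sketch of the corrected grounding flow: only DBpedia URIs short-circuit the
// Wikidata lookup; a DBpedia literal alone still lets Wikidata contribute URIs.
public class GroundingFlow {
    static class Data {
        final List<String> uris = new ArrayList<>();
        final List<String> literals = new ArrayList<>();
    }

    static Data ground(Supplier<Data> dbpedia, Supplier<Data> wikidata) {
        Data data = dbpedia.get();
        if (!data.uris.isEmpty()) {
            return data;                    // URIs found: skip the extra latency
        }
        Data wd = wikidata.get();           // literals do not bypass this path
        data.uris.addAll(wd.uris);          // merge Wikidata results in
        data.literals.addAll(wd.literals);
        return data;
    }

    public static void main(String[] args) {
        Data literalOnly = new Data();
        literalOnly.literals.add("42");
        Data wd = new Data();
        wd.uris.add("http://www.wikidata.org/entity/Q42");
        Data merged = ground(() -> literalOnly, () -> wd);
        // A DBpedia literal no longer suppresses the Wikidata URI.
        System.out.println(merged.uris.size() + " uri(s), " + merged.literals.size() + " literal(s)");
    }
}
```

This keeps the latency win when DBpedia grounds the question to a URI, while matching the reviewer's point that literal-only answers still need the Wikidata enrichment.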

@coderabbitai coderabbitai bot left a comment
Actionable comments posted: 1

🧹 Nitpick comments (1)
test2.java (1)

1-9: This file appears unrelated to the PR's Wikidata grounding feature.

Based on the PR objectives, this PR adds Wikidata URI grounding support. This test2.java file is a standalone scratch file that doesn't test or relate to the new functionality (WikidataSPARQL, NLHandler changes, etc.). Consider removing this file from the PR or clarifying its purpose.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test2.java` around lines 1 - 9, test2.java is an unrelated scratch file
(class Test with main using QueryEngineHTTP) that does not pertain to the
Wikidata grounding feature; either remove test2.java from the PR or move it to a
dedicated experimental/tests folder and add a short comment or test harness
explaining its purpose and relation to the project (or update PR description to
justify inclusion). Ensure references to the artifact: the file name test2.java
and the Test class / main method are addressed when deleting or relocating so
the PR only contains files relevant to WikidataSPARQL and NLHandler changes.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: c3bb1f1c-91c5-4d9a-9c7f-068e6994a441

📥 Commits

Reviewing files that changed from the base of the PR and between ae9b7a9 and 70850be.

📒 Files selected for processing (2)
  • src/main/java/chatbot/lib/Utility.java
  • test2.java

@@ -0,0 +1,9 @@
import org.apache.jena.sparql.engine.http.QueryEngineHTTP;

public class Test {

⚠️ Potential issue | 🔴 Critical

Class name does not match filename — compilation will fail.

The file is named test2.java but contains public class Test. In Java, a public class must reside in a file with a matching name. Either rename the file to Test.java or rename the class to Test2.

🐛 Proposed fix
-public class Test {
+public class Test2 {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test2.java` at line 3, The public class declaration "public class Test"
doesn't match the filename test2.java; rename the class to "Test2" or rename the
file to "Test.java" so they match; if you choose to rename the class (symbol:
Test → Test2) update the class declaration, any constructors, and all
references/usages in the codebase and tests accordingly, or if you rename the
file ensure build configs/reference paths are updated to use Test.java.
