diff --git a/api/index.md b/api/index.md index f4490e8f5..7c0761070 100644 --- a/api/index.md +++ b/api/index.md @@ -43,6 +43,7 @@ Finally, an R client library is available from a third-party provider, [Tidy Int A key is required for APIs to authenticate and authorize requests, as follows: - All REST [V2](rest/v2/index.md) APIs. These requests are served by endpoints at `api.datacommons.org`. - [Python and Pandas V2](python/v2/index.md) APIs, also served by `api.datacommons.org`. +- Data Commons MCP server requests. These are served by `api.datacommons.org/mcp`. - All requests coming from a custom Data Commons instance. These are also served by `api.datacommons.org`. - Data Commons NL API requests (used by the [DataGemma](https://ai.google.devgit/gemma/docs/datagemma){: target="_blank"} tool). These are served by endpoints at `nl.datacommons.org`. diff --git a/assets/images/mcp.png b/assets/images/mcp.png deleted file mode 100644 index f83eb482b..000000000 Binary files a/assets/images/mcp.png and /dev/null differ diff --git a/assets/images/mcp1.png b/assets/images/mcp1.png new file mode 100644 index 000000000..dc848d3e9 Binary files /dev/null and b/assets/images/mcp1.png differ diff --git a/assets/images/mcp2.png b/assets/images/mcp2.png new file mode 100644 index 000000000..7a90c7178 Binary files /dev/null and b/assets/images/mcp2.png differ diff --git a/custom_dc/deploy_mcp_cloud.md b/custom_dc/run_mcp_tools.md similarity index 72% rename from custom_dc/deploy_mcp_cloud.md rename to custom_dc/run_mcp_tools.md index bc7a69ec0..1a7d3fb42 100644 --- a/custom_dc/deploy_mcp_cloud.md +++ b/custom_dc/run_mcp_tools.md @@ -1,22 +1,56 @@ --- layout: default -title: Run an MCP server in Google Cloud +title: Run an MCP server nav_order: 9 parent: Build your own Data Commons --- {:.no_toc} -# Run an MCP server in Google Cloud +# Run an MCP server + +To use Data Commons MCP tools with a Custom Data Commons, you must run your own instance of the Data Commons MCP server. 
This page describes how to run a server locally and in Google Cloud. + +> **Important**: +> If you have not rebuilt your Data Commons image since the stable release of 2026-01-29, you must [sync to the latest stable release](/custom_dc/build_image.html#sync-code-to-the-stable-branch), [rebuild your image](/custom_dc/build_image.html#build-package) and [redeploy](/custom_dc/deploy_cloud.html#manage-your-service). + +* TOC +{:toc} + +## Run a local MCP server + +You can use any AI agent to spawn a local MCP server as a subprocess, or start a standalone server and connect to it from any client. For the most part, the procedures to do so are the same as those provided in [Run your own MCP server](/mcp/run_tools.html#self-hosted). The main difference is that you must set additional environment variables, as described below. + +### Prerequisites + +- Install `uv` for managing and installing Python packages; see the instructions at {: target="_blank"}. + +### Configure environment variables + +To run against a Custom Data Commons instance, you must set the following required variables: +- `DC_API_KEY="YOUR_API_KEY"` +- `DC_TYPE="custom"` +- `CUSTOM_DC_URL="YOUR_INSTANCE_URL"` + +Various other optional variables are also available; all are documented in [packages/datacommons-mcp/.env.sample](https://github.com/datacommonsorg/agent-toolkit/blob/main/packages/datacommons-mcp/.env.sample){: target="_blank"}. + +You can set variables in the following ways: +1. Set them in a shell/startup script (e.g. `.bashrc`). +1. Use an `.env` file. This is useful if you're setting multiple variables, to keep all settings in one place. Copy the contents of [`.env.sample`](https://github.com/datacommonsorg/agent-toolkit/blob/main/packages/datacommons-mcp/.env.sample){: target="_blank"} into a file called `.env` in the directory where you plan to run the server and/or agent. +1. If you are using Gemini CLI (not the extension), you can use the `env` option in the `settings.json` file.
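+For example, a minimal `.env` file for a local server pointed at a Custom Data Commons instance might look like the following (all three values are placeholders; substitute your own API key and instance URL):

```bash
# .env — example settings for a Custom Data Commons instance.
# The key and URL below are placeholder values.
DC_API_KEY="your-api-key"
DC_TYPE="custom"
CUSTOM_DC_URL="https://your-instance.example.com"
```

The same assignments also work as `export` lines in a shell startup script such as `.bashrc`.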
+ + + +## Run the MCP Server in Google Cloud Platform If you have built a custom agent or Gemini CLI extension which you want to make publicly available, this page describes how to run the [Data Commons MCP server](https://pypi.org/project/datacommons-mcp/) in the cloud, using Google Cloud Run. Since setting up an MCP server is a simple, one-time setup, there's no need to use Terraform to manage it. Data Commons provides a prebuilt Docker image in the Artifact Registry, so you only need to set up a new Cloud Run service to point to it. -## Prebuilt images +### Prebuilt images There are several versions of the image available, viewable at . We recommend that you choose a production version with a specific version number, to ensure that changes introduced by the Data Commons team don't break your application. -## Before you start: decide on a hosting model +### Before you start: decide on a hosting model There are several ways you can host the MCP server in Cloud Run, namely: @@ -25,13 +59,13 @@ There are several ways you can host the MCP server in Cloud Run, namely: In this page, we provide steps for running the Data Commons MCP server as a standalone container. If you want to go with the sidecar option, please see [Deploying multiple containers to a service (sidecars)](https://docs.cloud.google.com/run/docs/deploying#sidecars){: target="_blank"} for additional requirements and setup procedures. -## Prerequisites +### Prerequisites The following procedures assume that you have set up the following Google Cloud Platform services, using the [Terraform scripts](deploy_cloud.md#terraform): - A service account and roles. - A Google Cloud Secret Manager secret for storing your Data Commons API key. -## Create a Cloud Run Service for the MCP server +### Create a Cloud Run Service for the MCP server The following procedure sets up a bare-bones container service. 
To set additional options, such as request timeouts, instance replication, etc., please see [Configure Cloud Run services](https://docs.cloud.google.com/run/docs/configuring){: target="_blank"} for details. @@ -89,17 +123,16 @@ The following procedure sets up a bare-bones container service. To set additiona -## Connect to the server from a remote client +### Connect to the server from a remote client -For details, see the following pages: -- [Connect to the server from a local Gemini CLI client](/mcp/run_tools.html#gemini-cli-remote) -- [Connect to the server from a local agent](/mcp/run_tools.html#remote) +For details, see [Configure an agent to connect to the running server](/mcp/run_tools.html#standalone-client). The HTTP URL parameter is the Cloud Run App URL, if you are exposing the service directly, or a custom domain URL if you are using a load balancer and domain mapping. -## Troubleshoot deployment issues +### Troubleshoot deployment issues -### Container fails to start +{:.no_toc} +#### Container fails to start If you see this error message: diff --git a/mcp/develop_agent.md b/mcp/develop_agent.md deleted file mode 100644 index f4e93a1a0..000000000 --- a/mcp/develop_agent.md +++ /dev/null @@ -1,111 +0,0 @@ ---- -layout: default -title: Develop a custom agent -nav_order: 3 -parent: MCP - Query data interactively with an AI agent ---- - -# Develop your own agent - -This page shows you how to develop a custom Data Commons agent, using two approaches: - -- Write a [custom Gemini CLI extension]() - - Simple to set up, no code required - - Minimal customization possible, mostly LLM prompts - - Requires Gemini CLI as the client - -- Write a [custom Google ADK agent](#customize-the-sample-agent) - - Some code required - - Any customization possible - - Provides a UI client as part of the framework - -## Create a custom Gemini CLI extension - -Before you start, be sure you have installed the [required prerequisites](/mcp/run_tools.html#extension). 
- -### Create the extension - -To create your own Data Commons Gemini CLI extension: - -1. From the directory in which you want to create the extension, run the following command: -
-   gemini extensions new EXTENSION_NAME
-   
- The extension name can be whatever you want; however, it must not collide with an existing extension name, so do not use `datacommons`. Gemini will create a subdirectory with the same name, with a skeleton configuration file `gemini-extension.json`. -1. Switch to the subdirectory that has been created: -
-   cd EXTENSION_NAME
-   
-1. Create a new Markdown file (with a `.md` suffix). You can name it however you want, or just use the default, `GEMINI.md`. -1. Write natural-language prompts to specify how Gemini should handle user queries and tool results. See for a good example to get you started. Also see the Google ADK page on [LLM agent instructions](https://google.github.io/adk-docs/agents/llm-agents/#guiding-the-agent-instructions-instruction){: target="_blank"} for tips on how to write good prompts. -1. Modify `gemini-extension.json` to add the following configuration: -
-    {
-        "name": "EXTENSION_NAME",
-        "version": "1.0.0",
-        "description": "EXTENSION_DESCRIPTION",
-        // Only needed if the file name is not GEMINI.md
-        "contextFileName": "MARKDOWN_FILE_NAME",
-        "mcpServers": {
-            "datacommons-mcp": {
-                "command": "uvx",
-                "args": [
-                    "datacommons-mcp@latest",
-                    "serve",
-                    "stdio",
-                    "--skip-api-key-validation"
-                ],
-                "env": {
-                    "DC_API_KEY": "YOUR_DATA_COMMONS_API_KEY"
-                    // Set these if you are running against a Custom Data Commons instance
-                    "DC_TYPE="custom",
-	                "CUSTOM_DC_URL"="INSTANCE_URL"
-               }
-            }
-        }
-    }
-    
- The extension name is the one you created in step 1. In the `description` field, provide a brief description of your extension. If you release the extension publicly, this description will show up on . - - For additional options, see the [Gemini CLI extension documentation](https://geminicli.com/docs/extensions/#how-it-works){: target="_blank"}. -1. Run the following command to install your new extension locally: - ``` - gemini extensions link . - ``` - -### Run the extension locally - -1. From any directory, run `gemini`. -1. In the input box, enter `/extensions list` to verify that your extension is active. -1. Optionally, if you have already installed the Data Commons extension but do not want to use it, exit Gemini and from the command line, run: - ``` - gemini extensions disable datacommons - ``` -1. Restart `gemini`. -1. If you want to verify that `datacommons` is disabled, run `/extensions list` again. -1. Start sending queries! - -### Make your extension public - -If you would like to release your extension publicly for others to use, we recommend using a Github repository. See the [Gemini CLI extension release documentation](https://geminicli.com/docs/extensions/extension-releasing/){: target="_blank"} for full details. - - -## Customize the sample agent - -We provide two sample Google Agent Development Kit-based agents you can use as inspiration for building your own agent: - -- [Try Data Commons MCP Tools with a Custom Agent](https://github.com/datacommonsorg/agent-toolkit/blob/main/notebooks/datacommons_mcp_tools_with_custom_agent.ipynb) is a Google Colab tutorial that shows how to build an ADK Python agent step by step. -- The sample [basic agent](https://github.com/datacommonsorg/agent-toolkit/tree/main/packages/datacommons-mcp/examples/sample_agents/basic_agent) is a simple Python [Google ADK](https://google.github.io/adk-docs/) agent you can use to develop locally. You can make changes directly to the Python files. 
You'll need to [restart the agent](/mcp/run_tools.html#use-the-sample-agent) any time you make changes. - -> Tip: You do not need to install the Google ADK; when you use the [command we provide](run_tools.md#use-the-sample-agent) to start the agent, it downloads the ADK dependencies at run time. - -### Customize the model - -To change to a different LLM or model version, edit the `AGENT_MODEL` constant in [packages/datacommons-mcp/examples/sample_agents/basic_agent/agent.py](https://github.com/datacommonsorg/agent-toolkit/blob/main/packages/datacommons-mcp/examples/sample_agents/basic_agent/agent.py#L23){: target="_blank"}. - -### Customize agent behavior - -The agent's behavior is determined by prompts provided in the `AGENT_INSTRUCTIONS` in [packages/datacommons-mcp/examples/sample_agents/basic_agent/instructions.py](https://github.com/datacommonsorg/agent-toolkit/blob/main/packages/datacommons-mcp/examples/sample_agents/basic_agent/instructions.py){: target="_blank"}. - -You can add your own prompts to modify how the agent handles tool results. See the Google ADK page on [LLM agent instructions](https://google.github.io/adk-docs/agents/llm-agents/#guiding-the-agent-instructions-instruction){: target="_blank"} for tips on how to write good prompts. - diff --git a/mcp/index.md b/mcp/index.md index 4d169254e..30c78285d 100644 --- a/mcp/index.md +++ b/mcp/index.md @@ -15,30 +15,38 @@ has_children: true The Data Commons [Model Context Protocol (MCP)](https://modelcontextprotocol.io/docs/getting-started/intro) service gives AI agents access to the Data Commons knowledge graph and returns data related to statistical variables, topics, and observations. It allows end users to formulate complex natural-language queries interactively, get data in textual, structured or unstructured formats, and download the data as desired. 
For example, depending on the agent, a user can answer high-level questions such as "give me the economic indicators of the BRICS countries", view simple tables, and download a CSV file of the data in tabular format. -The MCP server returns data from datacommons.org by default or can be configured to query a Custom Data Commons instance. +The MCP server returns data from datacommons.org ("base") by default. It can also be configured to query a Custom Data Commons instance. -The server is a Python binary based on the [FastMCP 2.0 framework](https://gofastmcp.com). A prebuilt package is available at . +For base Data Commons, the server is available as a hosted managed deployment to which you can connect from any AI agent running locally or remotely. -At this time, there is no centrally deployed server; you run your own server, and any client you want to connect to it. +![base Data Commons](/assets/images/mcp1.png) -![alt text](/assets/images/mcp.png) +You can also run your own MCP server locally, or in Google Cloud Platform. If you want to use the server to query a Custom Data Commons instance, you _must_ run your own. The server is available as: +- A prebuilt Python package for running locally +- A prebuilt Docker image for running in a Google Cloud Run service -You can run the server and client locally, or you can run the server and client on different machines. +![base or Custom Data Commons](/assets/images/mcp2.png) ## Tools The server currently supports the following tools: -- `search_indicators`: Searches for available variables and/or topics (a hierarchy of sub-topics and member variables) for a given place or metric. -- `get_observations`: Fetches statistical data for a given variable and place. +- `search_indicators`: Searches for available variables and/or topics (a hierarchy of sub-topics and member variables) for a given place or metric. This allows queries like: + - "Tell me what data you have about health in Egypt." 
+ - "Do you have GDP data for Eastern European countries?" + - "What census data do you have for the U.S.?" +- `get_observations`: Fetches statistical data for a given variable and place. This allows queries like: + - "List the population of Canada since 1964." + - "Rank-order the GDP for all countries in Eastern Europe." + - "Compare the life expectancy between different countries in South America." ## Clients To connect to the Data Commons MCP Server, you can use any available AI application that supports MCP, or your own custom agent. -The server supports both standard MCP [transport protocols](https://modelcontextprotocol.io/docs/learn/architecture#transport-layer): -- Stdio: For clients that connect directly using local processes +For self-hosted deployments, the server supports both standard MCP [transport protocols](https://modelcontextprotocol.io/docs/learn/architecture#transport-layer): - Streamable HTTP: For clients that connect remotely or otherwise require HTTP (e.g. Typescript) +- Stdio: For clients that connect directly using local processes See [Run MCP tools](run_tools.md) for procedures for using [Gemini CLI](https://github.com/google-gemini/gemini-cli) and the [Gemini CLI Data Commons extension](https://geminicli.com/extensions/). diff --git a/mcp/run_tools.md b/mcp/run_tools.md index fb6b3d5bc..7399612ed 100644 --- a/mcp/run_tools.md +++ b/mcp/run_tools.md @@ -1,30 +1,32 @@ --- layout: default -title: Run MCP tools +title: Use MCP tools nav_order: 2 parent: MCP - Query data interactively with an AI agent --- {:.no_toc} -# Run MCP tools +# Use MCP tools -This page shows you how to run a local agent and connect to a Data Commons MCP server running locally or remotely. +This page describes how to run a local agent and connect to a Data Commons MCP server to query datacommons.org, using the centrally hosted server at `https://api.datacommons.org/mcp`. 
+ +For advanced use cases, this page also describes how to run your own local server and connect to it from an agent. + +For procedures for Custom Data Commons instances, please see instead [Run an MCP server](/custom_dc/run_mcp_tools.html). * TOC {:toc} -We provide specific instructions for the following agents. All may be used to query datacommons.org or a [Custom Data Commons instance](/custom_dc). +We provide specific instructions for the following agents. - [Gemini CLI extension](#extension) - Best for querying datacommons.org - Provides a built-in "agent" and context file for Data Commons - Downloads extension files locally - - Uses `uv` to run the MCP server locally - Minimal setup - [Gemini CLI](#use-gemini-cli) - No additional downloads - - MCP server can be run locally or remotely - You can create your own LLM context file - Minimal setup @@ -32,33 +34,21 @@ We provide specific instructions for the following agents. All may be used to qu - Best for interacting with a Web GUI - Can be used to run other LLMs and prompts - Downloads agent code locally - - Server may be run remotely - Some additional setup -For an end-to-end tutorial using a server and agent over HTTP, see the sample Data Commons Colab notebook, [Try Data Commons MCP Tools with a Custom Agent](https://github.com/datacommonsorg/agent-toolkit/blob/main/notebooks/datacommons_mcp_tools_with_custom_agent.ipynb){: target="_blank"}. +For an end-to-end tutorial using a locally running server and agent over HTTP, see the sample Data Commons Colab notebook, [Try Data Commons MCP Tools with a Custom Agent](https://github.com/datacommonsorg/agent-toolkit/blob/main/notebooks/datacommons_mcp_tools_with_custom_agent.ipynb){: target="_blank"}. -For other clients/agents, see the relevant documentation; you should be able to reuse the commands and arguments detailed below. +For other clients/agents, see the relevant documentation; you should be able to easily adapt the configurations detailed here. 
## Prerequisites -These are required for all agents: +This is required for all agents, regardless of the server deployment: - A (free) Data Commons API key. To obtain an API key, go to {: target="_blank"} and request a key for the `api.datacommons.org` domain. -- Install `uv` for managing and installing Python packages; see the instructions at {: target="_blank"}. Other requirements for specific agents are given in their respective sections. -> **Important**: Additionally, for Custom Data Commons instances: -> If you have not rebuilt your Data Commons image since the stable release of 2025-09-08, you must [sync to the latest stable release](/custom_dc/build_image.html#sync-code-to-the-stable-branch), [rebuild your image](/custom_dc/build_image.html#build-package) and [redeploy](/custom_dc/deploy_cloud.html#manage-your-service). - -## Configure environment variables - -You can set these in the following ways: -1. In your shell/startup script (e.g. `.bashrc`). This is the recommended option for most use cases. -1. [Use an `.env` file](#env), which the server locates automatically. This is useful for Custom Data Commons with multiple options, to keep all settings in one place. -1. If you are using Gemini CLI (not the extension), you can use the `env` option in the [`settings.json` file](#gemini). - -### Base Data Commons (datacommons.org) +### Configure environment variable For basic usage against datacommons.org, set the required `DC_API_KEY` in your shell/startup script (e.g. `.bashrc`). @@ -79,41 +69,14 @@ For basic usage against datacommons.org, set the required `DC_API_KEY` in your s -### Custom Data Commons - -To run against a Custom Data Commons instance, you must set additional variables. All supported options are documented in [packages/datacommons-mcp/.env.sample](https://github.com/datacommonsorg/agent-toolkit/blob/main/packages/datacommons-mcp/.env.sample){: target="_blank"}. 
- -The following variables are required: -- DC_API_KEY="YOUR API KEY" -- `DC_TYPE="custom"` -- CUSTOM_DC_URL="YOUR_INSTANCE_URL" - -You can also set additional variables as described in the `.env.sample` file. - -{: #env} -{: .no_toc} -#### Set variables with an `.env` file: - -1. From Github, download the file [`.env.sample`](https://github.com/datacommonsorg/agent-toolkit/blob/main/packages/datacommons-mcp/.env.sample){: target="_blank"} to the desired directory. Alternatively, if you plan to run the sample agent, clone the repo {: target="_blank"}. - -1. From the directory where you saved the sample file, copy it to a new file called `.env`. For example: - ```bash - cd ~/agent-toolkit/packages/datacommons-mcp - cp .env.sample .env - ``` -1. Set the following required variables, without quotes: - - `DC_API_KEY`: Set to your Data Commons API key - - `DC_TYPE`: Set to `custom`. - - `CUSTOM_DC_URL`: Uncomment and set to the URL of your instance. -1. Optionally, set other variables. -1. Save the file. +> **Tip:** If you are using Gemini CLI (not the extension), you can skip this step and specify the key in the Gemini CLI configuration file. 
{: #extension} ## Use the Gemini CLI extension **Additional prerequisites** -In addition to the [standard prerequisites](#prerequisites), you must have the following installed: +In addition to the Data Commons API key, you must install the following: - [Git](https://git-scm.com/){: target="_blank"} - [Google Gemini CLI](https://geminicli.com/docs/get-started/installation/){: target="_blank"} @@ -185,63 +148,33 @@ gemini extensions uninstall datacommons ## Use Gemini CLI -In addition to the [standard prerequisites](#prerequisites), you must have the following installed: +In addition to the Data Commons API key, you must install the following: - [Google Gemini CLI](https://geminicli.com/docs/get-started/installation/){: target="_blank"} {:.no_toc} -{: #gemini} -### Configure to run a local server - -To configure Gemini CLI to recognize the Data Commons server, edit the relevant `settings.json` file (e.g. `~/.gemini/settings.json`) to add the following: +### Configure +To configure Gemini CLI to connect to the Data Commons server, edit the relevant `settings.json` file (e.g. `~/.gemini/settings.json`) to add the following:
 {
    // ...
-    "mcpServers": {
-       "datacommons-mcp": {
-           "command": "uvx",
-            "args": [
-                "datacommons-mcp@latest",
-                "serve",
-                "stdio"
-            ],
-            "env": {
-                "DC_API_KEY": "YOUR_DATA_COMMONS_API_KEY"
-                // If you are using a Google API key
-                "GEMINI_API_KEY": "YOUR_GOOGLE_API_KEY",
-
-                // Only use these to run against a Custom Data Commons instance
-                "DC_TYPE": "custom",
-                "CUSTOM_DC_URL": "INSTANCE_URL"
-            },
-            "trust": true
-        }
-    }
-   // ...
-}
-
- -{:.no_toc} -### Configure to connect to a remote server {#gemini-cli-remote} - -1. Start up the MCP server in standalone mode, as described in [Run a standalone server](#run-a-standalone-server). -1. In the `settings.json` file, replace the `datacommons-mcp` specification as follows: -
-   {
-    "mcpServers": {
-        "datacommons-mcp": {
-            "httpUrl": "http://HOST:PORT/mcp",
-            "headers": {
-               "Content-Type": "application/json",
-               "Accept": "application/json, text/event-stream"
-            },
-            // other settings as above
+   "mcpServers": {
+     "datacommons-mcp": {
+         "httpUrl": "https://api.datacommons.org/mcp",
+         "headers": {
+           // If you have set the key in your environment
+           "X-API-Key": "$DC_API_KEY",
+            // If you have not set the key in your environment
+           "X-API-Key": "YOUR DC API KEY"
          }
       }
    }
-   
+ // ... +} + {:.no_toc} +{: #run-gemini} ### Run 1. From any directory, run `gemini`. @@ -250,16 +183,15 @@ To configure Gemini CLI to recognize the Data Commons server, edit the relevant > **Tip**: To ensure that Gemini CLI uses the Data Commons MCP tools, and not its own `GoogleSearch` tool, include a prompt to use Data Commons in your query. For example, use a query like "Use Data Commons tools to answer the following: ..." You can also add such a prompt to a [`GEMINI.md` file](https://codelabs.developers.google.com/gemini-cli-hands-on#9){: target="_blank"} so that it's persisted across sessions. -## Use the sample agent - -We provide a basic agent for interacting with the MCP Server in [packages/datacommons-mcp/examples/sample_agents/basic_agent](https://github.com/datacommonsorg/agent-toolkit/tree/main/packages/datacommons-mcp/examples/sample_agents/basic_agent){: target="_blank"}. +## Use the sample agent **Additional prerequisites** -In addition to the [standard prerequisites](#prerequisites), you will need: -- A GCP project and a Google AI API key. For details on supported keys, see {: target="_blank"}. +In addition to the Data Commons API key, you will need: - [Git](https://git-scm.com/){: target="_blank"} installed. +> Tip: You do not need to install the Google ADK; when you use the [command we provide](#run-sample) to start the agent, it downloads the ADK dependencies at run time. + {:.no_toc} ### Install @@ -269,10 +201,9 @@ git clone https://github.com/datacommonsorg/agent-toolkit.git ``` {:.no_toc} +{: #run-sample} ### Run -By default, the agent will spawn a local server and connect to it over Stdio. If you want to connect to a remote server, modify the code as described in [Connect to a remote server](#remote) before using this procedure. - 1. Go to the root directory of the repo: ```bash cd agent-toolkit @@ -299,33 +230,21 @@ By default, the agent will spawn a local server and connect to it over Stdio. If 1. 
Enter your [queries](#sample-queries) at the `User` prompt in the terminal. {:.no_toc} -### Configure to connect to a remote server {#remote} +### Customize the agent -If you want to connect to a remote MCP server, follow this procedure before starting the agent: +To customize the sample agent, you can make changes directly to the Python files. You'll need to [restart the agent](#run-sample) any time you make changes. -1. Start up the MCP server in standalone mode, as described in [Run a standalone server](#run-a-standalone-server). -1. Modify the code in [`basic_agent/agent.py`](https://github.com/datacommonsorg/agent-toolkit/blob/main/packages/datacommons-mcp/examples/sample_agents/basic_agent/agent.py){: target="_blank"} to set import modules and agent initialization parameters as follows: +{:.no_toc} +#### Customize the model -```python -from google.adk.tools.mcp_tool.mcp_toolset import ( - MCPToolset, - StreamableHTTPConnectionParams -) +To change to a different LLM or model version, edit the `AGENT_MODEL` constant in [packages/datacommons-mcp/examples/sample_agents/basic_agent/agent.py](https://github.com/datacommonsorg/agent-toolkit/blob/main/packages/datacommons-mcp/examples/sample_agents/basic_agent/agent.py#L23){: target="_blank"}. -root_agent = LlmAgent( - # ... - tools=[McpToolset( - connection_params=StreamableHTTPConnectionParams( - url=f"http://:/mcp", - headers={ - "Content-Type": "application/json", - "Accept": "application/json, text/event-stream" - }, - ), - ) - ], -) -``` +{:.no_toc} +#### Customize agent behavior + +The agent's behavior is determined by prompts provided in the `AGENT_INSTRUCTIONS` in [packages/datacommons-mcp/examples/sample_agents/basic_agent/instructions.py](https://github.com/datacommonsorg/agent-toolkit/blob/main/packages/datacommons-mcp/examples/sample_agents/basic_agent/instructions.py){: target="_blank"}. + +You can add your own prompts to modify how the agent handles tool results. 
See the Google ADK page on [LLM agent instructions](https://google.github.io/adk-docs/agents/llm-agents/#guiding-the-agent-instructions-instruction){: target="_blank"} for tips on how to write good prompts. ## Sample queries @@ -340,19 +259,152 @@ Here are some examples of such queries: - "Compare the life expectancy, economic inequality, and GDP growth for BRICS nations." - "Generate a concise report on income vs diabetes in US counties." -## Run a standalone server +{: #self-hosted} +## Advanced: Run your own MCP server + +This section describes how to run the Data Commons MCP server locally, and how to configure a client to connect to it. You can run the client locally or remotely. + +We provide procedures for the following scenarios: +- Local server and local agent: The agent spawns the server in a subprocess using Stdio as the transport protocol. +- Remote server and local agent: You start up the server as a standalone process and then connect the agent to it using streaming HTTP as the protocol. + +For both scenarios, we use Gemini CLI and the sample agent as examples. You should be able to adapt the configurations to other MCP-compliant agents/clients. + +**Additional prerequisities** + +- Install `uv` for managing and installing Python packages; see the instructions at {: target="_blank"}. + +### Run a local server and agent + +{:.no_toc} +#### Gemini CLI -The following procedure starts the MCP server in a local environment. To run the server in Google Cloud against a Custom Data Commons instance, see [Run an MCP server in Google Cloud](/custom_dc/deploy_mcp_cloud.html) +To instruct Gemini CLI to start up a local server using Stdio, replace the `datacommons-mcp` section in your `settings.json` file as follows: -1. Ensure you've set up the relevant server [environment variables](#configure-environment-variables). If you're using a `.env` file, go to the directory where the file is stored. -1. Run: +
+{
+   // ...
+   "mcpServers": {
+      "datacommons-mcp": {
+         "command": "uvx",
+         "args": [
+            "datacommons-mcp@latest",
+            "serve",
+            "stdio"
+         ],
+         // Only needed if you have not set the key in your environment
+         "env": {
+            "DC_API_KEY": "YOUR DC API KEY"
+         }
+      }
+   }
+   // ...
+}
+
+ +[Run Gemini CLI](#run-gemini) as usual. + +{:.no_toc} +#### Sample agent + +To instruct the sample agent to spawn a local server that uses the Stdio protocol, modify [`basic_agent/agent.py`](https://github.com/datacommonsorg/agent-toolkit/blob/main/packages/datacommons-mcp/examples/sample_agents/basic_agent/agent.py){: target="_blank"} to set import modules and agent initialization parameters as follows: + +```python +from google.adk.tools.mcp_tool.mcp_toolset import ( + McpToolset, + StdioConnectionParams, + StdioServerParameters, +) + +//... + +root_agent = LlmAgent( + model=AGENT_MODEL, + name="basic_agent", + instruction=AGENT_INSTRUCTIONS, + tools=[ + McpToolset( + connection_params=StdioConnectionParams( + timeout=10, + server_params=StdioServerParameters( + command="uvx", + args=["datacommons-mcp", "serve", "stdio"], + env={"DC_API_KEY": DC_API_KEY} + ) + ) + ) + ], +) +``` +[Run the startup commands](#run-sample) as usual. + +### Run a remote server and a local agent + +{:.no_toc} +{: #standalone} +#### Step 1: Start the server as a standalone process + +1. Be sure to set the API key as an [environment variable](#prerequisites). +2. Run:
    uvx datacommons-mcp serve http [--host HOSTNAME] [--port PORT]
    
-By default, the host is `localhost` and the port is `8080` if you don't set these flags explicitly. + By default, the host is `localhost` and the port is `8080` if you don't set these flags explicitly. The server is addressable with the endpoint `mcp`. For example, `http://my-mcp-server:8080/mcp`. -You can connect to the server using [Gemini CLI](#use-gemini-cli) or the [sample ADK agent](#use-the-sample-agent). If you're using a different client from the ones documented on this page, consult its documentation to determine how to specify an HTTP URL. +{: #standalone-client} +{:.no_toc} +#### Step 2: Configure an agent to connect to the running server + +{:.no_toc} +##### Gemini CLI + +Replace the `datacommons-mcp` section in your `settings.json` file as follows: +
+{
+   "mcpServers": {
+      "datacommons-mcp": {
+         "httpUrl": "http://HOST:PORT/mcp",
+         "headers": {
+            "Content-Type": "application/json",
+            "Accept": "application/json, text/event-stream",
+            // If you have set the key in your environment
+            "X-API-Key": "$DC_API_KEY",
+            // If you have not set the key in your environment
+            "X-API-Key": "YOUR DC API KEY"
+         }
+      }
+   }
+}
+
+ +[Run Gemini CLI](#run-gemini) as usual. + +{:.no_toc} +##### Sample agent + +Modify [`basic_agent/agent.py`](https://github.com/datacommonsorg/agent-toolkit/blob/main/packages/datacommons-mcp/examples/sample_agents/basic_agent/agent.py){: target="_blank"} as follows: + +
+from google.adk.tools.mcp_tool.mcp_toolset import (
+   McpToolset,
+   StreamableHTTPConnectionParams
+)
+
+root_agent = LlmAgent(
+      # ...
+      tools=[McpToolset(
+         connection_params=StreamableHTTPConnectionParams(
+            url="http://HOST:PORT/mcp",
+            headers={
+               "Content-Type": "application/json",
+               "Accept": "application/json, text/event-stream"
+            }
+         )
+      )
+   ],
+)
+
+ +[Run the startup commands](#run-sample) as usual.
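+
+Once a standalone server is running, you may want a quick way to confirm it is reachable before configuring an agent. The sketch below is a hypothetical helper, not part of the `datacommons-mcp` package: it sends a minimal JSON-RPC `initialize` request to the `/mcp` endpoint using only the Python standard library, with the same headers shown in the configurations above.

```python
# mcp_smoke_test.py — hypothetical reachability check, not part of the
# official tooling. Sends a minimal JSON-RPC "initialize" request to a
# Streamable HTTP MCP endpoint and returns the HTTP status code.
import json
import urllib.request


def mcp_initialize(base_url, api_key=None):
    """POST an MCP initialize request to <base_url>/mcp; return the HTTP status."""
    payload = {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "initialize",
        "params": {
            "protocolVersion": "2025-03-26",
            "capabilities": {},
            "clientInfo": {"name": "smoke-test", "version": "0.0.1"},
        },
    }
    headers = {
        "Content-Type": "application/json",
        "Accept": "application/json, text/event-stream",
    }
    if api_key:
        # Only needed for servers that check the X-API-Key header
        headers["X-API-Key"] = api_key
    req = urllib.request.Request(
        base_url.rstrip("/") + "/mcp",
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status
```

Assuming a server started locally with `uvx datacommons-mcp serve http` on the default port, calling `mcp_initialize("http://localhost:8080")` should return `200` if the server is up.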