1 change: 1 addition & 0 deletions ci/vale/styles/config/vocabularies/nat/accept.txt
@@ -151,6 +151,7 @@ Pareto
Patronus
PCIe
PDF(s?)
Perplexity
[Pp]luggable
[Pp]ostprocess
[Pp]ostprocessing
29 changes: 29 additions & 0 deletions docs/source/build-workflows/embedders.md
@@ -25,6 +25,7 @@ NeMo Agent Toolkit supports the following embedder providers:
| [OpenAI](https://openai.com) | `openai` | OpenAI API |
| [Azure OpenAI](https://learn.microsoft.com/en-us/azure/ai-foundry/openai/quickstart) | `azure_openai` | Azure OpenAI API |
| [Hugging Face](https://huggingface.co) | `huggingface` | Local sentence-transformers or remote Inference Endpoints (TEI) |
| [Perplexity](https://docs.perplexity.ai/api-reference/embeddings-post) | `perplexity` | Perplexity Embeddings API (`pplx-embed-v1-0.6b`, `pplx-embed-v1-4b`, contextualized variants) |

## Embedder Configuration

@@ -41,6 +42,9 @@ embedders:
azure_openai_embedder:
_type: azure_openai
azure_deployment: text-embedding-3-small
perplexity_embedder:
_type: perplexity
model_name: pplx-embed-v1-0.6b
```

### NVIDIA NIM
@@ -120,3 +124,28 @@ embedders:
endpoint_url: http://localhost:8081
api_key: ${HF_TOKEN}
```

### Perplexity

Perplexity exposes a dedicated embeddings endpoint at `POST https://api.perplexity.ai/v1/embeddings`. The toolkit ships a native client that batches inputs, decodes the on-wire base64 payload locally, and forwards an `X-Pplx-Integration: nemo-agent-toolkit/<version>` attribution header on every request.
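The local decode step can be sketched in a few lines of Python. This is an illustrative sketch, not the toolkit's actual client code, and it assumes the base64 string for a single input has already been extracted from the response:

```python
import base64
from array import array

def decode_int8_embedding(payload: str) -> list[int]:
    # Decode a base64_int8 payload: base64 text -> raw bytes -> signed int8 values.
    raw = base64.b64decode(payload)
    return list(array("b", raw))  # "b" = signed char (int8)

# Round-trip check against a known vector.
vector = [127, -128, 0, 42]
encoded = base64.b64encode(array("b", vector).tobytes()).decode("ascii")
print(decode_int8_embedding(encoded))  # → [127, -128, 0, 42]
```

Decoding locally keeps the wire payload compact (one byte per dimension) while still yielding plain integer vectors for downstream use.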

You can use the following environment variables to configure the Perplexity embedder provider:

* `PERPLEXITY_API_KEY` - The API key to access the Perplexity Embeddings API

The Perplexity embedder provider is defined by the {py:class}`~nat.embedder.perplexity_embedder.PerplexityEmbedderModelConfig` class.

* `model_name` - Embedding model identifier. Standard embeddings: `pplx-embed-v1-0.6b` (1024-dim, default) or `pplx-embed-v1-4b` (2560-dim). Document-aware: `pplx-embed-context-v1-0.6b` or `pplx-embed-context-v1-4b`
* `api_key` - Perplexity API key (falls back to `PERPLEXITY_API_KEY`)
* `base_url` - Base URL for the Perplexity API (default: `https://api.perplexity.ai/v1`)
* `dimensions` - Optional Matryoshka output dimension. Range 128-1024 for `0.6b` models and 128-2560 for `4b` models. Omit for full dimensions
* `encoding_format` - On-wire encoding: `base64_int8` (default, signed int8) or `base64_binary` (packed bits for large-scale retrieval)
* `batch_size` - Maximum inputs per request (1-512). Defaults to `64`
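The `batch_size` splitting described above can be sketched as follows; this is a simple illustration of the request-chunking behavior, not the toolkit's actual client code:

```python
def batched(texts: list[str], batch_size: int = 64) -> list[list[str]]:
    # Split the input texts into successive chunks of at most batch_size
    # items, one chunk per API request.
    return [texts[i:i + batch_size] for i in range(0, len(texts), batch_size)]

# 130 inputs with the default batch size of 64 yield three requests.
print([len(chunk) for chunk in batched(["text"] * 130)])  # → [64, 64, 2]
```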

```yaml
embedders:
perplexity_embedder:
_type: perplexity
model_name: pplx-embed-v1-0.6b
dimensions: 512 # optional Matryoshka truncation
```
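For `base64_binary`, each dimension is packed as a single bit. The sketch below unpacks such a payload into 0/1 values; the MSB-first bit order is an assumption made for illustration, not taken from the Perplexity API documentation:

```python
import base64

def decode_binary_embedding(payload: str, dimensions: int) -> list[int]:
    # Unpack a base64_binary payload: base64 text -> raw bytes -> one
    # 0/1 value per dimension, assuming MSB-first order within each byte.
    raw = base64.b64decode(payload)
    bits = [(byte >> shift) & 1 for byte in raw for shift in range(7, -1, -1)]
    return bits[:dimensions]  # drop any trailing padding bits

encoded = base64.b64encode(bytes([0b10110000])).decode("ascii")
print(decode_binary_embedding(encoded, 4))  # → [1, 0, 1, 1]
```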
6 changes: 3 additions & 3 deletions docs/source/components/integrations/frameworks.md
@@ -40,7 +40,7 @@ NeMo Agent Toolkit provides different levels of support for each framework acros
The ability to use various large language model providers with a framework, including NVIDIA NIM, OpenAI, Azure OpenAI, AWS Bedrock, LiteLLM, and Hugging Face.

### Embedder Provider Support
The ability to use embedding model providers for vector representations, including NVIDIA NIM embeddings, OpenAI embeddings, and Azure OpenAI embeddings.
The ability to use embedding model providers for vector representations, including NVIDIA NIM embeddings, OpenAI embeddings, Azure OpenAI embeddings, and Perplexity embeddings.

### Retriever Provider Support
The ability to integrate with vector databases and retrieval systems, such as NeMo Retriever and Milvus.
@@ -154,7 +154,7 @@ For more information, visit the [LangChain documentation](https://docs.langchain
| Capability | Providers / Details |
|-------------------------|-------------------------------------------------------------------------------------|
| **LLM Providers** | NVIDIA NIM, OpenAI, Azure OpenAI, AWS Bedrock, LiteLLM, Hugging Face |
| **Embedder Providers** | NVIDIA NIM, OpenAI, Azure OpenAI |
| **Embedder Providers** | NVIDIA NIM, OpenAI, Azure OpenAI, Perplexity |
| **Retriever Providers** | NeMo Retriever, Milvus |
| **Tool Calling** | Fully supported through LangChain's `StructuredTool` interface |
| **Profiling** | Comprehensive profiling support with callback handlers |
@@ -174,7 +174,7 @@ For more information, visit the [LlamaIndex website](https://www.llamaindex.ai/)
| Capability | Providers / Details |
|-------------------------|-------------------------------------------------------------------------------------|
| **LLM Providers** | NVIDIA NIM, OpenAI, Azure OpenAI, AWS Bedrock, LiteLLM |
| **Embedder Providers** | NVIDIA NIM, OpenAI, Azure OpenAI |
| **Embedder Providers** | NVIDIA NIM, OpenAI, Azure OpenAI, Perplexity |
| **Retriever Providers** | None (Use LlamaIndex native retrievers) |
| **Tool Calling** | Fully supported through LlamaIndex's `FunctionTool` interface |
| **Profiling** | Comprehensive profiling support with callback handlers |
66 changes: 51 additions & 15 deletions docs/source/get-started/tutorials/add-tools-to-a-workflow.md
@@ -109,26 +109,26 @@ Workflow Result:
```

## Alternate Method Using a Web Search Tool
Adding individual web pages to a workflow can be cumbersome, especially when dealing with multiple web pages. An alternative method is to use a web search tool. NeMo Agent Toolkit provides two web search tools: `tavily_internet_search` which utilizes the [Tavily Search API](https://tavily.com/), and `exa_internet_search` which utilizes the [Exa Search API](https://exa.ai/).
Adding individual web pages to a workflow can be cumbersome, especially when dealing with multiple web pages. An alternative method is to use a web search tool. NeMo Agent Toolkit provides several web search tools: `perplexity_internet_search`, which utilizes the [Perplexity Search API](https://docs.perplexity.ai/api-reference/search-post); `tavily_internet_search`, which utilizes the [Tavily Search API](https://tavily.com/); and `exa_internet_search`, which utilizes the [Exa Search API](https://exa.ai/).

### Using Tavily Search
### Using Perplexity Search

The `tavily_internet_search` tool is part of the `nvidia-nat[langchain]` package. To install the package, run:
The `perplexity_internet_search` tool ships with the core `nvidia-nat` package and is framework-agnostic: it can be used with any of the agent frameworks supported by the toolkit (`langchain`, `llama_index`, `crewai`, `semantic_kernel`, `agno`, `adk`, `strands`, and `autogen`). No framework-specific extra is required to install it:
```bash
# local package install from source
uv pip install -e ".[langchain]"
uv pip install -e .
```

Prior to using the `tavily_internet_search` tool, create an account at [`tavily.com`](https://tavily.com/) and obtain an API key. Once obtained, set the `TAVILY_API_KEY` environment variable to the API key:
Prior to using the `perplexity_internet_search` tool, create a Perplexity account and obtain an API key from the [API key page](https://www.perplexity.ai/account/api/keys). Once obtained, set the `PERPLEXITY_API_KEY` environment variable to the API key (`PPLX_API_KEY` is also accepted as a fallback):
```bash
export TAVILY_API_KEY=<YOUR_TAVILY_API_KEY>
export PERPLEXITY_API_KEY=<YOUR_PERPLEXITY_API_KEY>
```

We will now update the `functions` section of the configuration file, replacing the two `webpage_query` tools with a single `tavily_internet_search` tool entry:
We will now update the `functions` section of the configuration file, replacing the two `webpage_query` tools with a single `perplexity_internet_search` tool entry:
```yaml
functions:
internet_search:
_type: tavily_internet_search
_type: perplexity_internet_search
current_datetime:
_type: current_datetime
```
@@ -140,20 +140,56 @@ workflow:
tool_names: [internet_search, current_datetime]
```

The resulting configuration file is located at `examples/documentation_guides/workflows/custom_workflow/search_config.yml` in the NeMo Agent Toolkit repository.

When you re-run the workflow with the updated configuration file:
```bash
nat run --config_file examples/documentation_guides/workflows/custom_workflow/search_config.yml \
nat run --config_file <your_config_file> \
--input "How do I trace only specific parts of my LangChain application?"
```

This will then yield a slightly different result for the same question:
The `perplexity_internet_search` tool supports additional configuration options:
```yaml
functions:
internet_search:
_type: perplexity_internet_search
max_results: 5
max_retries: 3
max_query_length: 2000 # queries longer than this are truncated
search_recency_filter: week # 'hour', 'day', 'week', 'month', or 'year'
country: US # ISO 3166-1 alpha-2 country code
max_tokens_per_page: 4096
```
Workflow Result:
['To trace only specific parts of a LangChain application, users can use the `@traceable` decorator to mark specific functions or methods as traceable. Additionally, users can configure the tracing functionality to log traces to a specific project, add metadata and tags to traces, and customize the run name and ID. Users can also use the `LangChainTracer` class to trace specific invocations or parts of their application. Furthermore, users can use the `tracing_v2_enabled` context manager to trace a specific block of code.']

### Using Tavily Search

The `tavily_internet_search` tool is part of the `nvidia-nat[langchain]` package. To install the package, run:
```bash
# local package install from source
uv pip install -e ".[langchain]"
```

Prior to using the `tavily_internet_search` tool, create an account at [`tavily.com`](https://tavily.com/) and obtain an API key. Once obtained, set the `TAVILY_API_KEY` environment variable to the API key:
```bash
export TAVILY_API_KEY=<YOUR_TAVILY_API_KEY>
```

You can use the `tavily_internet_search` tool by updating the `functions` section of the configuration file:
```yaml
functions:
internet_search:
_type: tavily_internet_search
current_datetime:
_type: current_datetime
```

Then ensure the tool is included in the workflow tool list:
```yaml
workflow:
_type: react_agent
tool_names: [internet_search, current_datetime]
```

A sample configuration file using `tavily_internet_search` is located at `examples/documentation_guides/workflows/custom_workflow/search_config.yml` in the NeMo Agent Toolkit repository.

### Using Exa Search

The `exa_internet_search` tool is also part of the `nvidia-nat[langchain]` package. If you haven't already installed it:
@@ -167,7 +203,7 @@ Prior to using the `exa_internet_search` tool, create an account at [`exa.ai`](h
export EXA_API_KEY=<YOUR_EXA_API_KEY>
```

You can use the `exa_internet_search` tool in the same way as `tavily_internet_search` by updating the `functions` section of the configuration file:
You can use the `exa_internet_search` tool in the same way as the other web search tools by updating the `functions` section of the configuration file:
```yaml
functions:
internet_search:
@@ -0,0 +1,98 @@
# SPDX-FileCopyrightText: Copyright (c) 2026, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
# SPDX-License-Identifier: Apache-2.0
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

import typing

from pydantic import AliasChoices
from pydantic import ConfigDict
from pydantic import Field

from nat.builder.builder import Builder
from nat.builder.embedder import EmbedderProviderInfo
from nat.cli.register_workflow import register_embedder_provider
from nat.data_models.common import OptionalSecretStr
from nat.data_models.embedder import EmbedderBaseConfig
from nat.data_models.retry_mixin import RetryMixin
from nat.data_models.ssl_verification_mixin import SSLVerificationMixin

# Supported model identifiers for the Perplexity Embeddings API.
# Standard embeddings are for independent texts (queries/documents).
# Contextualized embeddings are document-aware; chunks from the same document share context.
PerplexityEmbeddingModel = typing.Literal[
"pplx-embed-v1-0.6b",
"pplx-embed-v1-4b",
"pplx-embed-context-v1-0.6b",
"pplx-embed-context-v1-4b",
]


class PerplexityEmbedderModelConfig(EmbedderBaseConfig, RetryMixin, SSLVerificationMixin, name="perplexity"):
"""A Perplexity Embeddings API provider to be used with an embedder client.

Perplexity exposes a dedicated embeddings endpoint at
``https://api.perplexity.ai/v1/embeddings`` (standard) and
``https://api.perplexity.ai/v1/contextualizedembeddings`` (contextualized).
Authentication uses ``PERPLEXITY_API_KEY``.

Reference: https://docs.perplexity.ai/api-reference/embeddings-post
"""

model_config = ConfigDict(protected_namespaces=(), extra="allow")

api_key: OptionalSecretStr = Field(
default=None,
description="Perplexity API key to interact with the embeddings endpoint. "
"Falls back to the ``PERPLEXITY_API_KEY`` environment variable when unset.",
)
base_url: str = Field(
default="https://api.perplexity.ai/v1",
description="Base URL for the Perplexity API. The embedder appends ``/embeddings`` "
"for standard models and ``/contextualizedembeddings`` for context models.",
)
model_name: PerplexityEmbeddingModel = Field(
default="pplx-embed-v1-0.6b",
validation_alias=AliasChoices("model_name", "model"),
serialization_alias="model",
description="Perplexity embedding model. Standard: ``pplx-embed-v1-0.6b`` (1024-dim) "
"or ``pplx-embed-v1-4b`` (2560-dim). Contextualized: ``pplx-embed-context-v1-0.6b`` "
"or ``pplx-embed-context-v1-4b``.",
)
dimensions: int | None = Field(
default=None,
ge=128,
le=2560,
description="Matryoshka output dimensions. Range is 128–1024 for ``0.6b`` models and "
"128–2560 for ``4b`` models. Defaults to full dimensions when unset.",
)
batch_size: int = Field(
default=64,
ge=1,
le=512,
description="Maximum number of input texts to send per request. The Perplexity API "
"accepts up to 512 inputs per call (subject to a 120,000 total-token cap).",
)
encoding_format: typing.Literal["base64_int8", "base64_binary"] = Field(
default="base64_int8",
description="On-wire encoding for the embedding payload. ``base64_int8`` (default) "
"returns signed int8 values; ``base64_binary`` returns 1-bit-per-dimension packed bits.",
)


@register_embedder_provider(config_type=PerplexityEmbedderModelConfig)
async def perplexity_embedder_model(config: PerplexityEmbedderModelConfig, _builder: Builder):
yield EmbedderProviderInfo(
config=config,
description="A Perplexity Embeddings API model for use with an Embedder client.",
)
1 change: 1 addition & 0 deletions packages/nvidia_nat_core/src/nat/embedder/register.py
@@ -21,3 +21,4 @@
from . import huggingface_embedder
from . import nim_embedder
from . import openai_embedder
from . import perplexity_embedder