Merged
129 commits
0288bfc
Add workflow reflection agent documentation and update uv.lock
Nov 14, 2025
8f9a147
Merge main into james-dev, resolve uv.lock conflict
Nov 14, 2025
ee1a0f7
add infra deployment
Nov 14, 2025
0bafb9d
add entra id & deployment & change to agent_modules
Nov 15, 2025
94a7723
add secure deployment optiont
Nov 16, 2025
50f7bf3
add CosmosDB as the default state store
Nov 17, 2025
6d176d2
add CosmosDB as the default state store
Nov 17, 2025
928ceea
Merge pull request #335 from microsoft/james-dev
tjsullivan1 Nov 20, 2025
8f6de11
Potential fix for code scanning alert no. 4: Information exposure thr…
tjsullivan1 Dec 9, 2025
bcbc4a8
Converted Fraud Detection UI from Create React App to Vite
DCMattyG Dec 15, 2025
e9175af
Updated Agentic AI React Frontend to proper component structure, upda…
DCMattyG Dec 15, 2025
0f9421f
Updated documentation based on React UI updates
DCMattyG Dec 15, 2025
fa93ef1
Merge pull request #350 from microsoft/int-agentic-ui-updates
james-tn Dec 15, 2025
c3640a6
WIP: Save local changes before switching to int-agentic
Dec 17, 2025
70faaf1
Fix WebSocket reconnect issue and Vite build compatibility
Dec 17, 2025
61a79a4
Merge int-agentic: WebSocket fix and Vite build compatibility
Dec 17, 2025
7aa2cc1
James dev (#351)
james-tn Dec 17, 2025
a90e69a
adding initial commit of terraform code. Moved bicep to sub directory
tjsullivan1 Dec 17, 2025
380beb7
updated iteration variable
tjsullivan1 Dec 17, 2025
68b3919
trying to figure out what changes between the two jobs when it comes …
tjsullivan1 Dec 17, 2025
c06ada1
updated environment for integration test steps
tjsullivan1 Dec 17, 2025
cfb0bd9
updated with tests, changed environment var in terraform steps, remov…
tjsullivan1 Dec 17, 2025
559bbc3
adding a readme for the github workflows
tjsullivan1 Dec 17, 2025
8fb5a5e
added use oidc
tjsullivan1 Dec 19, 2025
672ebfb
added use azuread auth too
tjsullivan1 Dec 19, 2025
5bbd00d
adding orchestrator overlay function
tjsullivan1 Dec 19, 2025
96f9acb
adding orchestrator overlay function, but fixing input name
tjsullivan1 Dec 19, 2025
ac28923
adding permissions to orchestrator layer
tjsullivan1 Dec 19, 2025
2039a33
Updated Orchestrator name
tjsullivan1 Dec 19, 2025
3cce522
updated workflows to segment out destruction of resources
tjsullivan1 Dec 19, 2025
40b7ab3
updated workflows to segment out destruction of resources, fixed inpu…
tjsullivan1 Dec 19, 2025
42303a4
updated orchestrator to run the destroy on dev/my test branch
tjsullivan1 Dec 19, 2025
62b2335
updated orchestrator order of if tjs-infra-as-code
tjsullivan1 Dec 19, 2025
1c8aa1c
updated preflight to ensure storage account is network reachable
tjsullivan1 Dec 19, 2025
a90cf69
added environment to preflight
tjsullivan1 Dec 19, 2025
36a2639
updated with default action
tjsullivan1 Dec 19, 2025
6bbc763
Refactor environment variable logic in workflows
tjsullivan1 Dec 31, 2025
c3e3f76
Update key vault networking settings in orchestrate.yml
tjsullivan1 Dec 31, 2025
40f28b1
Enhance key vault update logic in orchestrate.yml
tjsullivan1 Dec 31, 2025
c80de74
Add dependency on kv_secrets_cabe role assignment
tjsullivan1 Dec 31, 2025
c19eaae
Add dependency on azurerm_role_assignment for lifecycle
tjsullivan1 Dec 31, 2025
015f030
Refactor Key Vault role assignment and add UAMI
tjsullivan1 Dec 31, 2025
03705f5
Fix key vault name substring extraction
tjsullivan1 Dec 31, 2025
c2a1f6b
Merge pull request #353 from microsoft/tjs-infra-as-code
james-tn Jan 6, 2026
f4a971f
Merge origin/int-agentic into james-dev
Jan 6, 2026
3465859
update authentication and bicep deployment to use AAD authentication …
Jan 7, 2026
cb86d3e
complete terraform deployment
Jan 7, 2026
8d21e1f
update DEPLOYMENT and Terraform
Jan 7, 2026
4d78333
update DEPLOYMENT and Terraform
Jan 7, 2026
cc67741
Changed AZURE_OPENAI_API_VERSION to use a variable
tjsullivan1 Jan 7, 2026
7fca542
Reverted the OIDC changes on providers.tf
tjsullivan1 Jan 7, 2026
371d9cf
Reverted the OIDC changes on providers.tf
tjsullivan1 Jan 7, 2026
f911913
Removing key vault referene from orchestration workflow
tjsullivan1 Jan 7, 2026
1b146fe
removing key vault reference and openai secret key from infrastructur…
tjsullivan1 Jan 7, 2026
48a4779
changing docker to build off new image
tjsullivan1 Jan 7, 2026
b379a65
changing docker to build off new image
tjsullivan1 Jan 7, 2026
f968dce
changing docker to build off new image
tjsullivan1 Jan 7, 2026
2d0d524
Making backend config optionally remote in the proper way
tjsullivan1 Jan 8, 2026
421a8f6
Reverting backend change, seems to have broken state connection
tjsullivan1 Jan 8, 2026
324fa5b
adding a local provider file so I can have flexible backends
tjsullivan1 Jan 8, 2026
06e61d9
upgrade version of agent-framework and allow mcp in internal communic…
Jan 8, 2026
24707ee
Merge branch 'james-dev' of https://github.com/microsoft/OpenAIWorksh…
Jan 8, 2026
ce41fb2
Updated to work with both local and remote state
tjsullivan1 Jan 8, 2026
923e8f8
optimize reflection agent code and remove workflow reflection agent
Jan 8, 2026
db269ec
Merge branch 'james-dev' of https://github.com/microsoft/OpenAIWorksh…
Jan 8, 2026
a40610d
add github workflow
Jan 9, 2026
0605e60
update github workflow to use repo level variables
Jan 9, 2026
40542cb
update github workflow to use repo level variables
Jan 9, 2026
7b0776a
update github workflow to use repo level variables
Jan 9, 2026
50f2357
update github workflow to use repo level variables
Jan 9, 2026
393d44d
update test cases & test timeout & excluce MCP test bc mcp is deploye…
Jan 9, 2026
d582c36
move test to after deployment
Jan 9, 2026
ef5ba68
move test to after deployment
Jan 9, 2026
1c2d6fd
fix api version
Jan 9, 2026
a59ac4d
fix api version
Jan 9, 2026
926d65b
fix test run
Jan 9, 2026
66127c0
fix: Use placeholder image for Container Apps initial deployment
Jan 9, 2026
5becdb1
fix: Remove pull_request triggers from Docker workflows
Jan 9, 2026
55a2891
feat: Add james-dev to destroy-infrastructure condition
Jan 9, 2026
aeb5316
feat: Update Bicep for feature parity with Terraform
Jan 9, 2026
b80b119
docs: enhance README with Mermaid diagrams and enterprise deployment …
Jan 9, 2026
31b1b2e
docs: enhance README with Mermaid diagrams and enterprise deployment …
Jan 9, 2026
2c02c3b
refactor: merge MCP backends into unified contoso_tools with env switch
Jan 9, 2026
4b6d071
Update Cosmos DB setup scripts to reference unified backend with USE_…
Jan 9, 2026
bd7a297
Enable MCP deployment with CosmosDB: add all 12 containers, fix env v…
Jan 9, 2026
9551f44
Simplify deploy.ps1 for local-only execution with sensible defaults
Jan 9, 2026
76efa87
Remove unused local.env.ps1 - all config is in dev.tfvars
Jan 9, 2026
d7ec1f1
Updated deployment to reference tfvars file for local file/iteration …
tjsullivan1 Jan 9, 2026
be24677
Merge remote changes, keeping local deploy.ps1
Jan 9, 2026
9fa0c1c
Enterprise Security Infrastructure for Azure OpenAI Workshop (#357)
james-tn Jan 9, 2026
7f6ad9e
update mcp service to support CosmosDB
Jan 9, 2026
91a0e31
Merge main into int-agentic: resolve conflicts, keep Vite build, merg…
Jan 9, 2026
dda54df
Merge main into int-agentic: resolve conflicts, keep Vite build, add …
Jan 9, 2026
9229c7f
add bicep update & MCP with Cosmos
Jan 9, 2026
6c398d5
fix bicep script
Jan 9, 2026
8c08c41
update infra readme and mcp readme for CosmosDB as option for mcp bac…
Jan 9, 2026
2cff6c4
Merge int-agentic into james-dev: Combine Cosmos DB backend and agent…
Jan 9, 2026
4b675c5
Bicep Cosmos DB Backend Parity & Documentation (#363)
james-tn Jan 12, 2026
68441a9
Add permissions for contents in integration tests
tjsullivan1 Jan 12, 2026
763938f
clean up old documentation references
Jan 12, 2026
b764df3
Merge branch 'int-agentic' into james-dev
james-tn Jan 12, 2026
9cb491d
Merge pull request #365 from microsoft/james-dev
james-tn Jan 12, 2026
2a3188f
Merge branch 'main' into int-agentic
james-tn Jan 12, 2026
ed557c3
add ppt
Jan 12, 2026
5274c63
Merge branch 'james-dev' of https://github.com/microsoft/OpenAIWorksh…
Jan 12, 2026
acdc7aa
add ppt
Jan 12, 2026
2910b59
add ppt
Jan 12, 2026
ca35d9e
clean up old documentation references
Jan 12, 2026
fa1c0c6
add bullet point
Jan 12, 2026
9c62328
add database
Jan 12, 2026
b966c4c
add evaluation
Jan 14, 2026
83db3a9
add evaluation
Feb 3, 2026
f2ba847
update CI/CD workflow
Feb 3, 2026
b9b1f03
Merge pull request #375 from microsoft/james-dev
james-tn Feb 3, 2026
92b0cdd
upgrade fraud detection to use new version of the agent-framework
Feb 3, 2026
3755edc
upgrade fraud detection to use new version of the agent-framework
Feb 3, 2026
7f136c1
Fix evaluation framework issues
Feb 4, 2026
d884b91
change fraud detection to durable
Feb 4, 2026
9cd775a
fix eval bugs
Feb 4, 2026
9cc3538
add observability
Feb 4, 2026
f773485
Merge pull request #376 from microsoft/james-dev
james-tn Feb 4, 2026
0237de2
update deployment to include workflow
Feb 5, 2026
e902c1c
Merge pull request #380 from microsoft/james-dev
james-tn Feb 5, 2026
375b777
Fix observability sample: update imports and add proper async context…
Feb 5, 2026
7dc5d86
fix observability bug
Feb 6, 2026
676318d
Merge pull request #381 from microsoft/heena-dev-agentic-ai
james-tn Feb 6, 2026
bbd2fcc
Merge pull request #382 from microsoft/james-dev
james-tn Feb 6, 2026
e9d3e50
Merge main into int-agentic: resolve conflicts in README.md and remov…
Feb 6, 2026
8348389
fix(magentic): update to current agent-framework SDK APIs
Feb 6, 2026
190 changes: 190 additions & 0 deletions .github/workflows/agent-evaluation.yml
@@ -0,0 +1,190 @@
name: Agent Evaluation

on:
  # Run on PR when agent/evaluation code changes
  pull_request:
    paths:
      - 'agentic_ai/agents/**'
      - 'agentic_ai/evaluations/**'

  # Allow manual trigger
  workflow_dispatch:
    inputs:
      environment:
        description: Target environment
        type: choice
        options: [dev, integration]
        default: dev
      agent_name:
        description: 'Agent name for evaluation tracking'
        type: string
        default: 'ci-agent'
      limit:
        description: 'Limit number of test cases (0 = all)'
        type: number
        default: 5
      eval_type:
        description: 'Evaluation type'
        type: choice
        options: [all, single-turn-only, multi-turn-only]
        default: all
      push_to_foundry:
        description: 'Push results to Azure AI Foundry'
        type: boolean
        default: false

  # Callable from other workflows
  workflow_call:
    inputs:
      environment:
        type: string
        required: false
        default: 'dev'
      backend_endpoint:
        type: string
        required: true
        description: 'Backend API endpoint URL'
      mcp_endpoint:
        type: string
        required: true
        description: 'MCP service endpoint URL'
      agent_name:
        type: string
        required: false
        default: 'ci-agent'
      limit:
        type: number
        required: false
        default: 0
      push_to_foundry:
        type: boolean
        required: false
        default: false

env:
  PYTHON_VERSION: '3.12'

jobs:
  # ============================================================================
  # Evaluation - Run agent evaluation against test scenarios
  # ============================================================================
  evaluate:
    name: Agent Evaluation
    runs-on: ubuntu-latest
    permissions:
      contents: read
      id-token: write  # For OIDC authentication

    environment: ${{ inputs.environment || 'dev' }}

    steps:
      - uses: actions/checkout@v4

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: ${{ env.PYTHON_VERSION }}
          cache: 'pip'

      - name: Install uv
        run: |
          curl -LsSf https://astral.sh/uv/install.sh | sh
          echo "$HOME/.cargo/bin" >> $GITHUB_PATH

      - name: Install dependencies
        run: |
          cd agentic_ai/applications
          uv sync

      - name: Azure Login (OIDC)
        if: ${{ inputs.push_to_foundry == true }}
        uses: azure/login@v2
        with:
          client-id: ${{ vars.AZURE_CLIENT_ID }}
          tenant-id: ${{ vars.AZURE_TENANT_ID }}
          subscription-id: ${{ vars.AZURE_SUBSCRIPTION_ID }}

      - name: Get Azure credentials from Key Vault
        if: ${{ inputs.push_to_foundry == true }}
        run: |
          KEYVAULT_NAME="${{ vars.KEYVAULT_NAME }}"

          if [ -n "$KEYVAULT_NAME" ]; then
            AOAI_KEY=$(az keyvault secret show --vault-name "$KEYVAULT_NAME" --name "aoai-key" --query value -o tsv 2>/dev/null || echo "")
            echo "::add-mask::$AOAI_KEY"
            echo "AZURE_OPENAI_API_KEY=$AOAI_KEY" >> $GITHUB_ENV

            AI_PROJECT_ENDPOINT=$(az keyvault secret show --vault-name "$KEYVAULT_NAME" --name "ai-project-endpoint" --query value -o tsv 2>/dev/null || echo "")
            echo "AZURE_AI_PROJECT_ENDPOINT=$AI_PROJECT_ENDPOINT" >> $GITHUB_ENV
          fi

      - name: Run Agent Evaluation
        run: |
          cd agentic_ai/applications

          # Build command
          CMD="uv run python ../evaluations/run_agent_eval.py"
          CMD="$CMD --agent ${{ inputs.agent_name || 'ci-agent' }}"
          CMD="$CMD --backend-url ${{ inputs.backend_endpoint || 'http://localhost:7000' }}"

          # Add limit if specified
          if [ "${{ inputs.limit }}" != "0" ] && [ -n "${{ inputs.limit }}" ]; then
            CMD="$CMD --limit ${{ inputs.limit }}"
          fi

          # Add eval type filter
          if [ "${{ inputs.eval_type }}" == "single-turn-only" ]; then
            CMD="$CMD --single-turn-only"
          elif [ "${{ inputs.eval_type }}" == "multi-turn-only" ]; then
            CMD="$CMD --multi-turn-only"
          fi

          # Add remote flag if pushing to Foundry
          if [ "${{ inputs.push_to_foundry }}" == "true" ]; then
            CMD="$CMD --remote"
          else
            CMD="$CMD --local"
          fi

          echo "Running: $CMD"
          $CMD
        env:
          AZURE_OPENAI_ENDPOINT: ${{ vars.AZURE_OPENAI_ENDPOINT }}
          AZURE_OPENAI_CHAT_DEPLOYMENT: ${{ vars.AZURE_OPENAI_DEPLOYMENT }}
          AZURE_OPENAI_API_VERSION: '2025-03-01-preview'
          MCP_SERVER_URI: ${{ inputs.mcp_endpoint || 'http://localhost:8000/mcp' }}

      - name: Upload evaluation results
        uses: actions/upload-artifact@v4
        if: always()
        with:
          name: evaluation-results
          path: |
            agentic_ai/evaluations/eval_results/
            agentic_ai/evaluations/evaluation_input_data.jsonl
          retention-days: 30

      - name: Generate Summary
        if: always()
        run: |
          echo "## 📊 Agent Evaluation Results" >> $GITHUB_STEP_SUMMARY
          echo "" >> $GITHUB_STEP_SUMMARY
          echo "| Setting | Value |" >> $GITHUB_STEP_SUMMARY
          echo "|---------|-------|" >> $GITHUB_STEP_SUMMARY
          echo "| Agent | ${{ inputs.agent_name || 'ci-agent' }} |" >> $GITHUB_STEP_SUMMARY
          echo "| Environment | ${{ inputs.environment || 'dev' }} |" >> $GITHUB_STEP_SUMMARY
          echo "| Eval Type | ${{ inputs.eval_type || 'all' }} |" >> $GITHUB_STEP_SUMMARY
          echo "| Test Limit | ${{ inputs.limit || 'all' }} |" >> $GITHUB_STEP_SUMMARY
          echo "| Push to Foundry | ${{ inputs.push_to_foundry || 'false' }} |" >> $GITHUB_STEP_SUMMARY
          echo "" >> $GITHUB_STEP_SUMMARY
          echo "### Metrics Evaluated" >> $GITHUB_STEP_SUMMARY
          echo "" >> $GITHUB_STEP_SUMMARY
          echo "**Single-Turn (tool-focused):**" >> $GITHUB_STEP_SUMMARY
          echo "- Tool behavior (recall, precision, efficiency)" >> $GITHUB_STEP_SUMMARY
          echo "- Completeness, response quality, grounded accuracy" >> $GITHUB_STEP_SUMMARY
          echo "" >> $GITHUB_STEP_SUMMARY
          echo "**Multi-Turn (outcome-focused):**" >> $GITHUB_STEP_SUMMARY
          echo "- Solution accuracy, task adherence, intent resolution" >> $GITHUB_STEP_SUMMARY
          echo "- Coherence, fluency, relevance" >> $GITHUB_STEP_SUMMARY
          echo "" >> $GITHUB_STEP_SUMMARY
          echo "📁 See artifacts for detailed results" >> $GITHUB_STEP_SUMMARY
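Reviewer sketch (not part of the diff): the "Run Agent Evaluation" step builds its command by string concatenation and then expands `$CMD` unquoted, so an input containing spaces would be split into separate arguments. The Python below mirrors the same flag logic as an argument list, which may make the branching easier to check. The function name `build_eval_command` and its parameter names are hypothetical; only the script path, the flags, and the defaults are taken from the workflow above.

```python
import shlex
from typing import List, Optional


def build_eval_command(
    agent_name: str = "ci-agent",
    backend_endpoint: str = "http://localhost:7000",
    limit: Optional[int] = None,
    eval_type: str = "all",
    push_to_foundry: bool = False,
) -> List[str]:
    """Hypothetical helper mirroring the flag assembly in the workflow step."""
    cmd = [
        "uv", "run", "python", "../evaluations/run_agent_eval.py",
        "--agent", agent_name,
        "--backend-url", backend_endpoint,
    ]
    # The workflow omits --limit when the input is empty or 0 (0 means "all").
    if limit:
        cmd += ["--limit", str(limit)]
    # Only the two filtered modes add a flag; "all" adds nothing.
    if eval_type == "single-turn-only":
        cmd.append("--single-turn-only")
    elif eval_type == "multi-turn-only":
        cmd.append("--multi-turn-only")
    # Exactly one of --remote / --local is always appended.
    cmd.append("--remote" if push_to_foundry else "--local")
    return cmd


if __name__ == "__main__":
    # shlex.join quotes each argument, so the echoed command is copy-paste safe.
    print("Running:", shlex.join(build_eval_command(limit=5)))
```

Passing such a list to `subprocess.run` avoids the word splitting that the unquoted `$CMD` expansion is subject to; whether that matters here depends on what values callers pass for `agent_name` and `backend_endpoint`.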
9 changes: 5 additions & 4 deletions README.md
@@ -32,10 +32,11 @@ Welcome to the official repository for the Microsoft AI Agentic Workshop! This r
## Key Features

- **[Microsoft Agent Framework](https://github.com/microsoft/agent-framework) Integration** - Single-agent, multi-agent Magentic orchestration, and handoff-based domain routing with MCP tools. [Pattern guide →](agentic_ai/agents/agent_framework/README.md)
-- **[Workflow Orchestration](agentic_ai/workflow/)** - Pregel-style execution, checkpointing, human-in-the-loop patterns, and real-time observability. [Fraud Detection Demo →](agentic_ai/workflow/fraud_detection/)
-- **Advanced UI Options** - React frontend with streaming visualization or Streamlit for quick prototyping
+- **[Workflow Orchestration](agentic_ai/workflow/)** - Hybrid Workflow + Durable Task architecture with fan-out/fan-in topology, human-in-the-loop, and real-time observability. [Fraud Detection Demo →](agentic_ai/workflow/fraud_detection_durable/)
+- **[Observability with Application Insights](agentic_ai/observability/)** - Full tracing of agent executions, tool calls, and LLM invocations with pre-built Grafana dashboards. [Setup Guide →](agentic_ai/observability/README.md)
+- **Advanced UI Options** - React frontend with interactive workflow visualization and step-by-step tool call details
- **[MCP Server Integration](mcp/)** - Model Context Protocol for enhanced agent tool capabilities with advanced features: authentication, RBAC, and APIM integration
-- **[Emerging Agentic Scenarios](agentic_ai/scenarios/)** - Long-running workflows, progress updates, and durable agent patterns
+- **[Agent Evaluations](agentic_ai/evaluations/)** - Evaluate agent performance with custom metrics and test datasets
- **Agent State & History Persistence** - In-memory or CosmosDB backend for conversation history and agent state
- **[Enterprise-Ready Reference Architecture](infra/README.md)** - Production-grade deployment with VNet integration, private endpoints, managed identity, Terraform/Bicep IaC, and GitHub Actions CI/CD

@@ -46,7 +47,7 @@ Welcome to the official repository for the Microsoft AI Agentic Workshop! This r
1. Review the [Setup Instructions](./SETUP.md) for environment prerequisites and step-by-step installation.
2. Explore the [Business Scenario and Agent Design](./SCENARIO.md) to understand the workshop challenge.
3. Check out the **[Agent Framework Implementation Patterns](agentic_ai/agents/agent_framework/README.md)** to choose the right multi-agent approach (single-agent, Magentic orchestration, or handoff pattern).
-4. Try the **[Fraud Detection Workflow Demo](agentic_ai/workflow/fraud_detection/)** to see enterprise orchestration patterns in action.
+4. Try the **[Durable Fraud Detection Workflow](agentic_ai/workflow/fraud_detection_durable/)** to see hybrid Workflow + Durable Task orchestration with human-in-the-loop.
5. Dive into [System Architecture](./ARCHITECTURE.md) before building and customizing your agent solutions.
6. Utilize the [Support Guide](./SUPPORT.md) for troubleshooting and assistance.

6 changes: 6 additions & 0 deletions agentic_ai/agents/agent_framework/__init__.py
@@ -0,0 +1,6 @@
# Agent Framework module
# This package contains agent implementations built using the agent_framework SDK

from .single_agent import Agent

__all__ = ["Agent"]