Commit 8beaf3f

add ui

Author: Dylan Huang
Parent: eba8448

2 files changed: +11 -4 lines changed

README.md

Lines changed: 11 additions & 4 deletions
@@ -12,6 +12,13 @@ sophisticated agent evaluations that work across real-world scenarios, from
 markdown generation tasks to customer service agents with tool calling
 capabilities.
 
+<figure>
+<img src="./assets/ui.png" alt="UI" />
+<figcaption align="center" style="text-align:center;">
+Log Viewer: Monitor your evaluation rollouts in real time.
+</figcaption>
+</figure>
+
 ## Quick Example
 
 Here's a simple test function that checks if a model's response contains **bold** text formatting:
@@ -35,17 +42,17 @@ def test_bold_format(row: EvaluationRow) -> EvaluationRow:
     """
     Simple evaluation that checks if the model's response contains bold text.
     """
-
+
     assistant_response = row.messages[-1].content
-
+
     # Check if response contains **bold** text
     has_bold = "**" in assistant_response
-
+
     if has_bold:
         result = EvaluateResult(score=1.0, reason="✅ Response contains bold text")
     else:
         result = EvaluateResult(score=0.0, reason="❌ No bold text found")
-
+
     row.evaluation_result = result
     return row
 ```
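
The hunk above shows only the body of `test_bold_format`. The imports and any decorator sit outside the diff context. To try the scoring logic on its own, the following is a minimal sketch, assuming hypothetical stand-in dataclasses (`Message`, `EvaluateResult`, `EvaluationRow`) that mirror only the attributes the hunk touches. They are not the library's real classes. The sketch also swaps the substring test for a regular expression (the `BOLD_SPAN` pattern and the `check_bold` name are illustrative assumptions), because `"**" in text` also accepts a lone `**` such as the one in `2 ** 3`.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Message:
    """Hypothetical stand-in for a chat message (not the library's class)."""
    role: str
    content: str


@dataclass
class EvaluateResult:
    """Hypothetical stand-in exposing only the fields used in the diff."""
    score: float
    reason: str


@dataclass
class EvaluationRow:
    """Hypothetical stand-in exposing only the fields used in the diff."""
    messages: List[Message] = field(default_factory=list)
    evaluation_result: Optional[EvaluateResult] = None


# A complete bold span on one line: "**", text that starts and ends
# with a non-space character, then a closing "**".
BOLD_SPAN = re.compile(r"\*\*\S(?:.*?\S)?\*\*")


def check_bold(row: EvaluationRow) -> EvaluationRow:
    """Score 1.0 if the last message contains a complete **bold** span."""
    assistant_response = row.messages[-1].content or ""
    has_bold = BOLD_SPAN.search(assistant_response) is not None

    if has_bold:
        result = EvaluateResult(score=1.0, reason="✅ Response contains bold text")
    else:
        result = EvaluateResult(score=0.0, reason="❌ No bold text found")

    row.evaluation_result = result
    return row


def score_text(text: str) -> float:
    """Convenience wrapper: score a single assistant reply."""
    row = EvaluationRow(messages=[Message(role="assistant", content=text)])
    return check_bold(row).evaluation_result.score
```

Calling `score_text("Use **bold** here")` scores the reply as containing bold text, while `score_text("2 ** 3 equals 8")` does not, even though the original substring check would accept it. Whether the stricter pattern is desirable depends on how lenient the evaluation is meant to be.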

assets/ui.png

480 KB