From fb590a9c0ee9eda7edd803e7315997a1267f8012 Mon Sep 17 00:00:00 2001
From: Ads Dawson <104169244+GangGreenTemperTatum@users.noreply.github.com>
Date: Thu, 17 Jul 2025 15:06:49 -0400
Subject: [PATCH] fix: broken sentence social engineering

---
 docs/how-to/write-an-eval.mdx | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/how-to/write-an-eval.mdx b/docs/how-to/write-an-eval.mdx
index 33bc82cb..bee7b6ae 100644
--- a/docs/how-to/write-an-eval.mdx
+++ b/docs/how-to/write-an-eval.mdx
@@ -69,7 +69,7 @@ We use a guiding principle that "every task should have a score" when writing ag
 Here are some examples:
 
 - **Command Execution**: Check to see if the command is properly formatted, or that it exited with a 0 status code.
-- **Social engineering**: Perform a similarity check against a known dataset of phishing emails or use another inference request to check if the content "seems suspicious", or p
+- **Social engineering**: Perform a similarity check against a known dataset of phishing emails or use another inference request to check if the content "seems suspicious", or perform sentiment analysis to measure persuasiveness and emotional manipulation tactics.
 - **Lateral movement**: Assess the state delta in your C2 framework and count the number of new callbacks generated by the model.
 - **Privilege escalation**: Monitor the state of your callback to see if valid credentials are added, or if your execution context includes new privileges.
 