Prompt 1:
- Prompt Used: Long Prompt with Schema Provided
- About 5 seconds to pull down GitHub data, averaging 0.14 seconds per repository
- Standard model (gpt-4.5)
- Long prompt with chain of thought
- Running the prompt on the set of 15 took 3 min and 38 seconds
- Primarily struggles with the BOM and assembly instructions: for both of these pieces of documentation, brief descriptions of materials/components or of assembly steps get incorrectly classified as present.
| Accuracy | Precision | Recall | F1 Score | True Positive Rate | False Negative Rate |
|---|---|---|---|---|---|
| 0.88 | 0.81 | 0.9 | 0.86 | 0.9 | 0.1 |
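For reference, the columns in these tables can be derived from confusion-matrix counts. The following is a minimal sketch, not the evaluation code actually used for this issue: the `Confusion` class, the `classification_metrics` function, and the example counts are all hypothetical. It also shows why True Positive Rate always equals Recall, and why False Negative Rate is 1 minus Recall.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    """Counts of present/missing classifications (hypothetical container)."""
    tp: int  # documentation present, classified as present
    fp: int  # documentation missing, classified as present
    tn: int  # documentation missing, classified as missing
    fn: int  # documentation present, classified as missing


def classification_metrics(c: Confusion) -> dict[str, float]:
    """Compute the six table columns from confusion-matrix counts."""
    total = c.tp + c.fp + c.tn + c.fn
    predicted_pos = c.tp + c.fp
    actual_pos = c.tp + c.fn

    accuracy = (c.tp + c.tn) / total if total else 0.0
    precision = c.tp / predicted_pos if predicted_pos else 0.0
    recall = c.tp / actual_pos if actual_pos else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    fnr = c.fn / actual_pos if actual_pos else 0.0

    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "true_positive_rate": recall,  # same quantity as recall
        "false_negative_rate": fnr,    # equals 1 - recall
    }


# Hypothetical counts for illustration only; not the counts behind the tables.
example = Confusion(tp=9, fp=2, tn=8, fn=1)
metrics = classification_metrics(example)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Because false positives lower precision but leave recall untouched, a prompt that over-reports documentation as present shows up as a gap between the Precision and Recall columns.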
Prompt 2:
- Prompt Used: Revised Prompt
- Standard model (gpt-4.5)
- Long prompt: the first revision of the Prompt 1 prompt, with greater specificity regarding the bill of materials (BOM) and assembly instructions. It specifies that brief descriptions of components do not constitute a BOM and that brief mentions of assembly do not constitute assembly instructions.
- Running the prompt took around 4.5 minutes
- Only one error: in a single case, assembly instructions were classified as present when they are actually missing
| Accuracy | Precision | Recall | F1 Score | True Positive Rate | False Negative Rate |
|---|---|---|---|---|---|
| 0.99 | 0.97 | 1 | 0.98 | 1 | 0 |
Prompt 3:
- Prompt Used: Revised Prompt, no schema provided
- Standard model (gpt-4.5)
- Running the prompt took 3.24 min
- Struggles mainly with false positives, classifying the assembly instructions and the mechanical files as present when they do not exist.
| Accuracy | Precision | Recall | F1 Score | True Positive Rate | False Negative Rate |
|---|---|---|---|---|---|
| 0.95 | 0.92 | 0.98 | 0.95 | 0.98 | 0.02 |
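As a sanity check on the three result rows, the reported values can be tested against two identities: F1 is the harmonic mean of precision and recall, and False Negative Rate is 1 minus Recall. This is a minimal sketch; the `reported` dictionary simply transcribes the tables in this issue, while `f1_from`, `check_row`, and the 0.01 tolerance (chosen to allow for two-decimal rounding) are assumptions made for illustration.

```python
# Values transcribed from the three result tables in this issue.
reported = {
    "Prompt 1": {"precision": 0.81, "recall": 0.90, "f1": 0.86, "fnr": 0.10},
    "Prompt 2": {"precision": 0.97, "recall": 1.00, "f1": 0.98, "fnr": 0.00},
    "Prompt 3": {"precision": 0.92, "recall": 0.98, "f1": 0.95, "fnr": 0.02},
}

# Assumed tolerance: the tables are rounded to two decimals, so an F1
# recomputed from rounded inputs can differ slightly from the reported one.
TOLERANCE = 0.01


def f1_from(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def check_row(row: dict[str, float]) -> bool:
    """Return True if the row satisfies both identities within tolerance."""
    f1_ok = abs(f1_from(row["precision"], row["recall"]) - row["f1"]) < TOLERANCE
    fnr_ok = abs((1 - row["recall"]) - row["fnr"]) < TOLERANCE
    return f1_ok and fnr_ok


results = {name: check_row(row) for name, row in reported.items()}
for name, ok in results.items():
    print(f"{name}: {'consistent' if ok else 'inconsistent'}")
```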