From cf923d6e87976db2be2d100d03036944084bbe4a Mon Sep 17 00:00:00 2001
From: William Jeynes
Date: Sun, 5 Apr 2026 11:50:53 +0100
Subject: [PATCH] Add new accuracy results

---
 agent/README.md | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/agent/README.md b/agent/README.md
index 88e9fbe..1b57423 100644
--- a/agent/README.md
+++ b/agent/README.md
@@ -14,10 +14,13 @@ Experiments modifying pipeline
 Experiments with different model types:
 | Model                         | % Correct | % Change |
 |-------------------------------|----------:|---------:|
-| gpt-5-mini                    |        33 |        0 |
-| gpt-5.4-mini                  |      32.4 |    -0.02 |
-| llama3.1:8b-instruct-q4_K_M   |         ? |        ? |
-| qwen3.5:9b                    |         0 |     -100 |
+| gpt-5-mini                    |     45.51 |          |
+| gpt-5.4-mini                  |      32.4 |          |
+| gpt-5.4-nano                  |     23.28 |          |
+| gpt-4.1-mini                  |     27.85 |          |
+| gpt-4o-mini                   |     32.47 |          |
+| llama3.1:8b-instruct-q4_K_M   |         ? |          |
+| qwen3.5:9b                    |         0 |          |
 
 %age valid URLS
 | Model | Number | % Age |