Update README, lock langchain CLI to specific version

Remove some very unused prompts
Add database link to README
7 changed files with 40 additions and 30 deletions
@@ -1,9 +1,22 @@
 # AI models for identifying trigger events in disinformation analysis
 Final Dissertation Submission Repository
-## Project Description
+## Abstract
 -- todo --
 [Project Presentation](https://jillweynes.github.io/LLMsForDisinformationPrediction-GraphVizBuilt/presentation)
 ## Generated Database Link and Usage Experiments
 Generated Dataset Link: [https://huggingface.co/datasets/WillJeynes/LLMsForDisinformationAnalysis-Dataset](https://huggingface.co/datasets/WillJeynes/LLMsForDisinformationAnalysis-Dataset)
 Graph-Based Dataset Visualisation: [https://jillweynes.github.io/LLMsForDisinformationPrediction-GraphVizBuilt/](https://jillweynes.github.io/LLMsForDisinformationPrediction-GraphVizBuilt/)
 Usage Experiments (incl graph visualisation) Source Code: [https://github.com/WillJeynes/LLMsForDisinformationPrediction](https://github.com/WillJeynes/LLMsForDisinformationPrediction)
 # This repository:
 ## Solution Diagram
 -- todo --
@@ -13,8 +26,6 @@ Final Dissertation Submission Repository
 ## Agent Refinement
 [See agent](/agent/)
 ## Generated Database Link and Usage Experiments
 -- todo --
 ## Repository Structure
 ```
@@ -14,15 +14,19 @@ Experiments modifying pipeline
 Experiments with different model types:
 | Model                         | % Correct | % Change |
 |-------------------------------|----------:|---------:|
-| gpt-5-mini                    | 33        | 0        |
+| gpt-5-mini                    | 45.51     |          |
-| gpt-5.4-mini                  | 32.4      | -0.02    |
+| gpt-5.4-mini                  | 32.4      |          |
-| llama3.1:8b-instruct-q4_K_M   | ?         | ?        |
+| gpt-5.4-nano                  | 23.28     |          |
-| qwen3.5:9b                    | 0         | -100     |
+| gpt-4.1-mini                  | 27.85     |          |
 | gpt-4o-mini                   | 32.47     |          |
 | llama3.1:8b-instruct-q4_K_M   | ?         |          |
 | qwen3.5:9b                    | 0         |          |
 Percentage of valid URLs
 | Model                         | Valid / Total | % Valid |
 |-------------------------------|----------:|---------:|
 | gpt-5-mini                    | 22/405    | 5.43     |
 | gpt-5.4-mini                  | 29/278    | 10.43    |
-| llama3.1:8b-instruct-q4_K_M   | ?         | ?        |
+| gpt-5.4-nano                  | 6/210     | 2.85     |
-| qwen3.5:9b                    | 0         | 0        |
+| gpt-4.1-mini                  | 15/269    | 5.57     |
 | gpt-4o-mini                   | 27/287    | 9.407    |
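The percentage column in the table above follows from the counts as valid divided by total, times 100. A minimal TypeScript sketch of that arithmetic, assuming half-up rounding to two decimals; `percentValid` is a hypothetical helper and not part of the repository.

```typescript
// Hypothetical helper (not from the repository): percentage of valid URLs
// from a "valid/total" count, rounded half-up to `decimals` places.
function percentValid(valid: number, total: number, decimals: number = 2): number {
  if (total <= 0) {
    throw new RangeError("total must be positive");
  }
  const factor = Math.pow(10, decimals);
  return Math.round((valid / total) * 100 * factor) / factor;
}

// Counts taken from the table rows above.
console.log(percentValid(22, 405)); // gpt-5-mini
console.log(percentValid(29, 278)); // gpt-5.4-mini
console.log(percentValid(6, 210));  // gpt-5.4-nano
```

With half-up rounding, 6/210 comes out as 2.86 and 15/269 as 5.58, so the 2.85 and 5.57 in the table look truncated rather than rounded. Which convention was intended is not stated in the source.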
@@ -1,9 +0,0 @@
 Could the following real-world event:
 ###TECLAIM###
 Be a trigger for the following disinformation:
 ###TITLE###
 Respond with "RELATION", followed by : followed by a confidence score (VERYHIGH, HIGH, MEDIUM, LOW, VERYLOW) followed by : followed by the reason. Use no other words, just return the score and reason in format.
 Ignore whether the event happened or not; consider only the likelihood of causation.
@@ -1,8 +0,0 @@
 Do the search results cited below
 ###TESEARCH###
 Support the idea that the following happened:
 ###TECLAIM###
 Respond with "CONFIDENCE", followed by : followed by a confidence score (VERYHIGH, HIGH, MEDIUM, LOW, VERYLOW) followed by : followed by the reason. Use no other words, just return the score and reason in format.
 Dates can be off by a few days, that would still be valid
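Both removed prompts ask for a reply shaped as a label, a confidence level, and a reason, separated by colons. A rough TypeScript sketch of how such a reply could be parsed is shown here. `parseVerdict` and its types are hypothetical and are not taken from the repository; tolerating extra whitespace and lowercase is also an assumption, as is the sample reply string.

```typescript
// Labels and levels exactly as listed in the two prompts.
const VERDICT_LABELS = ["RELATION", "CONFIDENCE"] as const;
const VERDICT_LEVELS = ["VERYHIGH", "HIGH", "MEDIUM", "LOW", "VERYLOW"] as const;
type VerdictLabel = (typeof VERDICT_LABELS)[number];
type VerdictLevel = (typeof VERDICT_LEVELS)[number];

interface ParsedVerdict {
  label: VerdictLabel;
  level: VerdictLevel;
  reason: string;
}

// Type guard: is `value` one of the allowed string literals?
function isOneOf<T extends string>(allowed: readonly T[], value: string): value is T {
  return allowed.some((a) => a === value);
}

// Hypothetical parser: splits on the first two colons only, so the
// free-text reason may itself contain colons. Returns null on any mismatch.
function parseVerdict(raw: string): ParsedVerdict | null {
  const first = raw.indexOf(":");
  const second = first === -1 ? -1 : raw.indexOf(":", first + 1);
  if (second === -1) {
    return null;
  }
  const label = raw.slice(0, first).trim().toUpperCase();
  const level = raw.slice(first + 1, second).trim().toUpperCase();
  const reason = raw.slice(second + 1).trim();
  if (!isOneOf(VERDICT_LABELS, label) || !isOneOf(VERDICT_LEVELS, level)) {
    return null;
  }
  return { label, level, reason };
}

// Illustrative reply, not a real model output.
console.log(parseVerdict("RELATION:HIGH:The event precedes the claim: plausible trigger"));
```

Returning `null` instead of throwing lets a caller count malformed replies separately from low-confidence ones, which may matter when comparing models.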
@@ -5,7 +5,7 @@ set -e
 run_agent () {
    echo "Starting LangGraph agent..."
    cd agent
-    npx @langchain/langgraph-cli dev
+    npx @langchain/langgraph-cli@1.1.17 dev
 }
 run_ensemble_service () {
@@ -19,6 +19,9 @@ const MODE = process.env.MODE ?? "claim";
 const MAX_CONCURRENCY = 5;
+const OFFSET = parseInt(process.env.OFFSET ?? "0", 10);
+const LIMIT = process.env.LIMIT ? parseInt(process.env.LIMIT, 10) : null;
 const client = new Client({ apiUrl: API_URL });
@@ -164,9 +167,18 @@ async function processRecord(record: any): Promise<ResultRecord> {
 async function main() {
  console.log("Reading input file...");
-  const records = await loadInputs();
+  const allRecords = await loadInputs();
-  console.log(`Loaded ${records.length} records`);
+  console.log(`Loaded ${allRecords.length} records`);
+  const records = allRecords.slice(
+    OFFSET,
+    LIMIT !== null ? OFFSET + LIMIT : undefined
+  );
+  console.log(
+    `Processing ${records.length} records (offset=${OFFSET}, limit=${LIMIT ?? "∞"})`
+  );
  fs.writeFileSync(OUTPUT_FILE, "", { flag: "a" });
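The offset/limit window above can be exercised in isolation. The sketch below mirrors the `slice` call from `main()` in a standalone function and adds a stricter parser for the two settings. `selectWindow` and `parseCount` are hypothetical names, and the stricter validation is an assumption rather than the script's behaviour: the script itself passes whatever `parseInt` returns straight to `slice`, which treats `NaN` as 0.

```typescript
// Same windowing as in main(): a null limit means "up to the end".
function selectWindow<T>(all: readonly T[], offset: number, limit: number | null): T[] {
  return all.slice(offset, limit !== null ? offset + limit : undefined);
}

// Hypothetical stricter parser for OFFSET / LIMIT style values.
// Returns `fallback` when the value is unset or empty, throws on junk.
function parseCount(raw: string | undefined, fallback: number | null): number | null {
  if (raw === undefined || raw === "") {
    return fallback;
  }
  const n = parseInt(raw, 10);
  if (isNaN(n) || n < 0) {
    throw new Error(`expected a non-negative integer, got "${raw}"`);
  }
  return n;
}

// Ten stand-in records, processed in windows.
const demoRecords = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9];
console.log(selectWindow(demoRecords, 4, 3));    // [4, 5, 6]
console.log(selectWindow(demoRecords, 8, 5));    // [8, 9] (window runs past the end)
console.log(selectWindow(demoRecords, 4, null)); // [4, 5, 6, 7, 8, 9]
```

Because a window that runs past the end is simply cut short, a large input can be split into fixed-size batches by raising the offset by the limit on each run, without special handling for the last batch.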
@@ -27,7 +27,7 @@ DEFAULT_PARAMS = [
    ("organization", "http://weverify.eu/resource/Organization/3727f7b2aa90ec0716693e5464b28d18"), # StopFake
 ]
-NUM_RANDOM_CLAIMS = 200
+NUM_RANDOM_CLAIMS = 2000
 INPUT_FILE = "../../data/input.jsonl"
 OUTPUT_FILE = "../../data/claims.json"
Author	SHA1	Message	Date
William Jeynes	4e0bab9897	Update README, lock langchain CLI to specific version	2026-05-07 18:45:12 +01:00
William Jeynes	c4dac3f515	Remove some very unused prompts	2026-05-03 21:46:54 +01:00
William Jeynes	2252a42466	Add database link to README	2026-04-09 15:46:18 +01:00
William Jeynes	75ca1032a6	Add offset and limit in preparation for the large dataset	2026-04-05 22:47:25 +01:00
William Jeynes	00d129bd28	add % valid URLs for different model	2026-04-05 12:31:09 +01:00
William Jeynes	cf923d6e87	Add new accuracy results	2026-04-05 11:51:28 +01:00