6 Commits

| Author | SHA1 | Message | Date |
|----------------|------------|---------|------|
| William Jeynes | 4e0bab9897 | Update README, lock langchain CLI to specific version | 2026-05-07 18:45:12 +01:00 |
| William Jeynes | c4dac3f515 | Remove some very unused prompts | 2026-05-03 21:46:54 +01:00 |
| William Jeynes | 2252a42466 | Add database link to README | 2026-04-09 15:46:18 +01:00 |
| William Jeynes | 75ca1032a6 | Add offset and limit in pereparation for the large dataset | 2026-04-05 22:47:25 +01:00 |
| William Jeynes | 00d129bd28 | add % valid URLs for different model | 2026-04-05 12:31:09 +01:00 |
| William Jeynes | cf923d6e87 | Add new accuracy results | 2026-04-05 11:51:28 +01:00 |
9 changed files with 41 additions and 40 deletions
+14 -3
@@ -1,9 +1,22 @@
# AI models for identifying trigger events in disinformation analysis
Final Dissertation Submission Repository
## Project Description
## Abstract
-- todo --
+[Project Presentation](https://jillweynes.github.io/LLMsForDisinformationPrediction-GraphVizBuilt/presentation)
+## Generated Database Link and Usage Experiments
+Generated Dataset Link: [https://huggingface.co/datasets/WillJeynes/LLMsForDisinformationAnalysis-Dataset](https://huggingface.co/datasets/WillJeynes/LLMsForDisinformationAnalysis-Dataset)
+Graph-Based Dataset Visualisation: [https://jillweynes.github.io/LLMsForDisinformationPrediction-GraphVizBuilt/](https://jillweynes.github.io/LLMsForDisinformationPrediction-GraphVizBuilt/)
+Usage Experiments (incl graph visualisation) Source Code: [https://github.com/WillJeynes/LLMsForDisinformationPrediction](https://github.com/WillJeynes/LLMsForDisinformationPrediction)
+# This repository:
## Solution Diagram
-- todo --
@@ -13,8 +26,6 @@ Final Dissertation Submission Repository
## Agent Refinement
[See agent](/agent/)
-## Generated Database Link and Usage Experiments
--- todo --
## Repository Structure
```
+10 -6
@@ -14,15 +14,19 @@ Experiments modifying pipeline
Experiments with different model types:
| Model | % Correct | % Change |
|-------------------------------|----------:|---------:|
-| gpt-5-mini | 33 | 0 |
-| gpt-5.4-mini | 32.4 | -0.02 |
-| llama3.1:8b-instruct-q4_K_M | ? | ? |
-| qwen3.5:9b | 0 | -100 |
+| gpt-5-mini | 45.51 | |
+| gpt-5.4-mini | 32.4 | |
+| gpt-5.4-nano | 23.28 | |
+| gpt-4.1-mini | 27.85 | |
+| gpt-4o-mini | 32.47 | |
+| llama3.1:8b-instruct-q4_K_M | ? | |
+| qwen3.5:9b | 0 | |
%age valid URLS
| Model | Number | % Age |
|-------------------------------|----------:|---------:|
| gpt-5-mini | 22/405 | 5.43 |
| gpt-5.4-mini | 29/278 | 10.43 |
-| llama3.1:8b-instruct-q4_K_M | ? | ? |
-| qwen3.5:9b | 0 | 0 |
+| gpt-5.4-nano | 6/210 | 2.85 |
+| gpt-4.1-mini | 15/269 | 5.57 |
+| gpt-4o-mini | 27/287 | 9.407 |
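
The `% Age` column appears to be the `Number` column (valid URLs over total URLs produced) expressed as a percentage. A minimal TypeScript sketch of that arithmetic follows, assuming each fraction is written as `valid/total`; `percentValid` is a hypothetical helper and is not part of the repository. Rounding to two decimal places gives 2.86, 5.58 and 9.41 for the last three rows, so the table values 2.85, 5.57 and 9.407 look truncated rather than rounded.

```typescript
// Hypothetical helper, not taken from the repository.
// Parses a "valid/total" fraction as written in the Number column and
// returns the percentage rounded to two decimal places.
function percentValid(fraction: string): number {
  const match = /^\s*(\d+)\s*\/\s*(\d+)\s*$/.exec(fraction);
  if (match === null) {
    throw new Error(`Expected "valid/total", got "${fraction}"`);
  }
  const valid = Number(match[1]);
  const total = Number(match[2]);
  if (total === 0) {
    throw new Error("Total must be greater than zero");
  }
  // Scale by 10000 and divide by 100 so rounding happens at two decimals.
  return Math.round((valid / total) * 10000) / 100;
}

// Fractions copied from the table above.
const urlRows: Array<[string, string]> = [
  ["gpt-5-mini", "22/405"],
  ["gpt-5.4-mini", "29/278"],
  ["gpt-5.4-nano", "6/210"],
  ["gpt-4.1-mini", "15/269"],
  ["gpt-4o-mini", "27/287"],
];

for (const [model, fraction] of urlRows) {
  console.log(`${model}: ${percentValid(fraction)}%`);
}
```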
+1 -1
@@ -9,7 +9,7 @@ export function createModelNode(tools: any, promptPath: string): GraphNode<typeo
const sysPrompt = await hydratePrompt(promptPath, state);
const model = new ChatOpenAI({
-model: "gpt-4.1-mini"
+model: "gpt-5-mini"
});
const modelWithTools = model.bindTools(Object.values(tools));
-9
@@ -1,9 +0,0 @@
Could the following real-world event:
###TECLAIM###
Be a trigger for the following disinformation:
###TITLE###
Respond with "RELATION", followed by : followed by a confidence score (VERYHIGH, HIGH, MEDIUM, LOW, VERYLOW) followed by : followed by the reason. Use no other words, just return the score and reason in format.
Ignore wether the event happened or not, purely consider the likiness of causation
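
The reply format this removed prompt asks for (`RELATION:<score>:<reason>`) could be parsed along the following lines. This is only a sketch: `parseRelationReply` and `RelationReply` are hypothetical names, and the diff does not show how (or whether) the repository parsed these replies.

```typescript
// The five confidence labels listed in the prompt text.
const SCORES = ["VERYHIGH", "HIGH", "MEDIUM", "LOW", "VERYLOW"] as const;
type Score = (typeof SCORES)[number];

// Hypothetical result shape, not taken from the repository.
interface RelationReply {
  score: Score;
  reason: string;
}

// Returns null when the reply does not follow "RELATION:<score>:<reason>".
// Only the first two colons act as separators, so the reason may itself
// contain colons.
function parseRelationReply(reply: string): RelationReply | null {
  const match = /^\s*RELATION\s*:\s*([A-Z]+)\s*:\s*([\s\S]*)$/.exec(reply);
  if (match === null) {
    return null;
  }
  const score = match[1] ?? "";
  if ((SCORES as readonly string[]).indexOf(score) === -1) {
    return null;
  }
  return { score: score as Score, reason: (match[2] ?? "").trim() };
}

console.log(parseRelationReply("RELATION:HIGH:Election announced: polls opened"));
console.log(parseRelationReply("I think it is related"));
```

Returning `null` instead of throwing lets a caller count malformed replies without interrupting a batch run; that is a design choice of this sketch, not something the source prescribes.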
-9
@@ -8,10 +8,6 @@ Produce up-to 5 specific "trigger events" that happened that could have led to t
Remember the time frame of the disinformation campaign: ###CDATE###
Include no information or events that would not have been available at the time.
-You MEED TO use the tools available to you in order to produce up to date information on URL and search query, else you will be wrong and the analysis invalid.
-You NEED TO use the web search and open URL tools to ensure page validity or else all work upto this point will have to be discarded.
Produce no more text other than the json.
Include a concise but specific search query that can be looked up on a search engine in order to allow for the verification.
@@ -30,9 +26,4 @@ Events will be reordered as part of processing, each statement must stand alone
The preceeding messages act as examples of previous responses to potentially ficitonal events and scores given.
Analysis should only be completed for proposed events that would graner >0.7 points
-This pipeline is running well pasy your knowledge cutoff.
-Any URLs will change signigicantly over time.
-You MEED TO use the tools available to you in order to produce up to date information on URL and search query, else you will be wrong and the analysis invalid.
-You NEED TO use the web search and open URL tools to ensure page validity or else all work upto this point will have to be discarded.
Lets go through it step by step
-8
@@ -1,8 +0,0 @@
Do the search results cited below
###TESEARCH###
Support the idea that the following happened:
###TECLAIM###
Respond with "CONFIDENCE", followed by : followed by a confidence score (VERYHIGH, HIGH, MEDIUM, LOW, VERYLOW) followed by : followed by the reason. Use no other words, just return the score and reason in format.
Dates can be off by a few days, that would still be valid
+1 -1
@@ -5,7 +5,7 @@ set -e
run_agent () {
echo "Starting LangGraph agent..."
cd agent
-npx @langchain/langgraph-cli dev --host 127.0.0.1
+npx @langchain/langgraph-cli@1.1.17 dev
}
run_ensemble_service () {
+14 -2
@@ -19,6 +19,9 @@ const MODE = process.env.MODE ?? "claim";
const MAX_CONCURRENCY = 5;
+const OFFSET = parseInt(process.env.OFFSET ?? "0", 10);
+const LIMIT = process.env.LIMIT ? parseInt(process.env.LIMIT, 10) : null;
const client = new Client({ apiUrl: API_URL });
@@ -164,10 +167,19 @@ async function processRecord(record: any): Promise<ResultRecord> {
async function main() {
console.log("Reading input file...");
-const records = await loadInputs();
+const allRecords = await loadInputs();
-console.log(`Loaded ${records.length} records`);
+console.log(`Loaded ${allRecords.length} records`);
+const records = allRecords.slice(
+OFFSET,
+LIMIT !== null ? OFFSET + LIMIT : undefined
+);
+console.log(
+`Processing ${records.length} records (offset=${OFFSET}, limit=${LIMIT ?? "∞"})`
+);
fs.writeFileSync(OUTPUT_FILE, "", { flag: "a" });
const limit = pLimit(MAX_CONCURRENCY);
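
The offset/limit handling added in this hunk can be exercised in isolation. Below is a minimal sketch that assumes the same semantics as the diff (an unset `LIMIT` means "through to the end of the input"). `readWindow`, `sliceRecords` and `RecordWindow` are hypothetical names, the environment is passed in as a plain object so the example does not depend on Node's `process`, and the validation step is an addition that the diff itself does not contain.

```typescript
// Hypothetical shape for the parsed settings, not taken from the repository.
interface RecordWindow {
  offset: number;
  limit: number | null; // null means "no limit"
}

// Mirrors the parsing in the diff: OFFSET defaults to 0, LIMIT to null.
// The range checks are an assumption added for this sketch.
function readWindow(env: Record<string, string | undefined>): RecordWindow {
  const rawLimit = env["LIMIT"];
  const offset = parseInt(env["OFFSET"] ?? "0", 10);
  const limit = rawLimit ? parseInt(rawLimit, 10) : null;
  if (isNaN(offset) || offset < 0) {
    throw new Error("OFFSET must be a non-negative integer");
  }
  if (limit !== null && (isNaN(limit) || limit < 0)) {
    throw new Error("LIMIT must be a non-negative integer");
  }
  return { offset, limit };
}

// Same slice expression as in the diff, wrapped in a function.
function sliceRecords<T>(all: T[], w: RecordWindow): T[] {
  return all.slice(w.offset, w.limit !== null ? w.offset + w.limit : undefined);
}

const sample = ["a", "b", "c", "d", "e"];
console.log(sliceRecords(sample, readWindow({ OFFSET: "1", LIMIT: "2" })));
console.log(sliceRecords(sample, readWindow({ OFFSET: "3" })));
console.log(sliceRecords(sample, readWindow({})));
```

Under these assumptions a large input could be processed in consecutive batches, for example `OFFSET=0 LIMIT=500` followed by `OFFSET=500 LIMIT=500`, with the final batch simply returning fewer records when the input runs out.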
+1 -1
@@ -27,7 +27,7 @@ DEFAULT_PARAMS = [
("organization", "http://weverify.eu/resource/Organization/3727f7b2aa90ec0716693e5464b28d18"), # StopFake
]
-NUM_RANDOM_CLAIMS = 200
+NUM_RANDOM_CLAIMS = 2000
INPUT_FILE = "../../data/input.jsonl"
OUTPUT_FILE = "../../data/claims.json"