Commit Graph

  • 4e0bab9897 Update README, lock langchain CLI to specific version master William Jeynes 2026-05-07 18:45:12 +01:00
  • c4dac3f515 Remove some very unused prompts William Jeynes 2026-05-03 21:46:54 +01:00
  • 2252a42466 Add database link to README William Jeynes 2026-04-09 15:46:18 +01:00
  • 75ca1032a6 Add offset and limit in preparation for the large dataset William Jeynes 2026-04-05 22:47:25 +01:00
  • 00d129bd28 add % valid URLs for different model William Jeynes 2026-04-05 12:31:09 +01:00
  • cf923d6e87 Add new accuracy results William Jeynes 2026-04-05 11:50:53 +01:00
  • d21a8b537e Add new accuracy results experiments4-MODEL3 William Jeynes 2026-04-05 11:50:53 +01:00
  • 1ac94441c5 Why no tool use? experiments4-MODEL1 William Jeynes 2026-04-04 23:47:21 +01:00
  • c481209ac4 Why no tool use? experiments4-MODEL2 William Jeynes 2026-04-04 23:47:21 +01:00
  • 42cf4da794 Why no tool use? William Jeynes 2026-04-04 23:47:21 +01:00
  • f303ca9ea4 Switch to 4o mini William Jeynes 2026-04-04 23:11:39 +01:00
  • 976f46b495 Switch to 4.1 mini William Jeynes 2026-04-04 23:10:56 +01:00
  • f3e2897806 switch to 5.4 nano William Jeynes 2026-04-04 23:10:06 +01:00
  • f821e9643d Add url validity metrics William Jeynes 2026-04-04 20:02:25 +01:00
  • 43ecd04135 add multithreading William Jeynes 2026-04-04 19:42:02 +01:00
  • 8c0921057b start on work to calculate % of valid URLs William Jeynes 2026-04-04 18:52:47 +01:00
  • b37799b3d2 Improve response extraction experiments3-deepseek William Jeynes 2026-04-02 21:02:26 +01:00
  • 10f2644408 Use a slightly smaller model. Reduce concurrency. Be more clear in the prompts William Jeynes 2026-04-02 20:10:57 +01:00
  • 7e586fe17d Allow for configurable ranking server url. Delete old ragas call William Jeynes 2026-04-02 13:48:15 +01:00
  • 7e37a22058 Switch to actual instruction model. For debug, log entire object. William Jeynes 2026-04-02 13:18:02 +01:00
  • 2ed47980ef Add better error handling to LLM output response William Jeynes 2026-03-31 19:26:56 +01:00
  • 01b04dd73e use a model we know has tool calling capabilities William Jeynes 2026-03-31 18:26:55 +01:00
  • 593baf9b15 add extra options William Jeynes 2026-03-31 17:15:55 +01:00
  • 893829e599 Switch to CPU only, as to not confuse GPU William Jeynes 2026-03-31 16:09:41 +01:00
  • 36c30a427d update deps. Install ollama for lang chain. Update model to deepseek William Jeynes 2026-03-31 16:08:28 +01:00
  • b610e8c989 Add sentence transformers to requirements for ensemble service William Jeynes 2026-03-31 15:52:14 +01:00
  • f8d4155b7c Add more robust parsing of LLM JSON output William Jeynes 2026-03-27 11:09:59 +00:00
  • 38ca7a3d34 I can't afford the full model lol. Use jsonrepair module to fix agent malformed JSON instead. experiments2-new-model William Jeynes 2026-03-26 15:37:14 +00:00
  • 38b6fb6a0e Use an even better model William Jeynes 2026-03-26 15:14:43 +00:00
  • a80d433fb6 Add self improvement pattern with two new prompt nodes experiments2-self-critique William Jeynes 2026-03-26 14:44:48 +00:00
  • c7cccb87c3 Update to 5.4 mini William Jeynes 2026-03-26 12:44:01 +00:00
  • fd0674e96a Add a chain of thought to the main prompt experiments2-cot-prompt William Jeynes 2026-03-26 12:33:43 +00:00
  • 5e374a8bd6 Fix errors seen during longer runs: selenium exceptions, insecure certificates, recursion limit exceeded, BM25 document corpus too small William Jeynes 2026-03-26 12:22:13 +00:00
  • fbc688b8f9 add date to returned data experiments-date William Jeynes 2026-03-25 22:37:14 +00:00
  • cbaab3d251 make prompt worse experiments-bad-prompt William Jeynes 2026-03-25 22:35:15 +00:00
  • 3286df6450 remove context examples experiments-no-prev William Jeynes 2026-03-25 22:32:41 +00:00
  • 77cdd9a01c Add statistics for model experiments. Fix dead link in documentation William Jeynes 2026-03-25 21:57:52 +00:00
  • a7f5978f64 Update documentation. Stop storing context. Decide on final claims source William Jeynes 2026-03-25 14:24:55 +00:00
  • 872346c657 Update run.sh to match new evaluation service William Jeynes 2026-03-24 19:16:48 +00:00
  • 8f939d54c4 Implement ensemble into final model structure William Jeynes 2026-03-24 19:07:24 +00:00
  • 624d45bc53 Re-allow multithreading on service. Add results table William Jeynes 2026-03-24 18:29:40 +00:00
  • 80bc151379 add majority voting William Jeynes 2026-03-24 16:50:41 +00:00
  • 5ce64290ce Make an ensemble model to combine scores together (very high accuracy) William Jeynes 2026-03-24 15:50:41 +00:00
  • 87fccb7e2b Add downloading from hugging face William Jeynes 2026-03-24 13:23:08 +00:00
  • 8c1e35f66f Increase dropout on regression model to cut down on overfitting William Jeynes 2026-03-24 13:16:18 +00:00
  • 44395bb251 add linear regression model initial version William Jeynes 2026-03-24 12:25:15 +00:00
  • e368c50577 Add training scripts for distilled, flan. Add run service for flan William Jeynes 2026-03-23 22:43:59 +00:00
  • 00e1596be0 tuned parameters for roberta_distilled? deberta_test William Jeynes 2026-03-23 15:45:18 +00:00
  • 070aab6a5c Actually we need to go the other way William Jeynes 2026-03-23 14:03:06 +00:00
  • bff5423f3d testing code for deberta, need to run on GPU William Jeynes 2026-03-22 16:55:21 +00:00
  • c69730df6b Refine scoring to allow for better iteration on frontend. Update generate_adversarial.py William Jeynes 2026-03-22 16:04:38 +00:00
  • f4e84af272 Make the model less overfitting. Make it harder for an event to be classed as "perfect" William Jeynes 2026-03-18 01:05:24 +00:00
  • 886b9a7d5d Ensure works on CUDA for extra speed William Jeynes 2026-03-17 23:14:50 +00:00
  • 8052d5c7ba Working on making the classifier harsher on unseen data William Jeynes 2026-03-17 22:19:03 +00:00
  • b08c1ada70 Small changes for the next set of human ranking William Jeynes 2026-03-17 00:18:32 +00:00
  • c89c7054fe Update agent to support new verification style. Update frontend to support new file format and remove redundant logic from old experiments. William Jeynes 2026-03-16 17:16:58 +00:00
  • 0a7bb114d2 Add removing of duplicates from pipeline. Add to sort step. Move score logic to robertaMetrics node. William Jeynes 2026-03-13 14:51:14 +00:00
  • d5c6cb444d Add better scoring, ignoring duplicates, catching under and over confidence. Showing difference between "FINE" and "PERFECT" William Jeynes 2026-03-13 12:18:52 +00:00
  • 8311556855 Add ROBERTA classifier ranking PoC, with 77pc off the bat William Jeynes 2026-03-13 11:24:51 +00:00
  • f09e36e740 Add initial version of ROBERTA classifier, add ability for multi pie charts William Jeynes 2026-03-11 22:02:31 +00:00
  • ef6330ec07 Add re-ranker mode to support re-ranking experiments, hopefully we can reduce the loss William Jeynes 2026-03-06 17:27:09 +00:00
  • f14d112017 Add difference between auto scoring system and our own labels William Jeynes 2026-03-03 15:58:39 +00:00
  • 6ae551a93f Ensure date is passed to pipeline. Fix woring William Jeynes 2026-03-02 14:58:26 +00:00
  • c94812ed80 Prepare for mass data collection. Reduce concurrency as to not overwhelm scraper on long sessions. Remove duplicates from fetch script. Removing naming weirdness on scorer frontend. William Jeynes 2026-02-27 14:41:10 +00:00
  • 201176e71c Refactor scorer for future maintainability William Jeynes 2026-02-26 10:25:49 +00:00
  • 6c3aa7343d Update how scoring works with two passes of the data for timesaving. Add section on edge case handling to rules. William Jeynes 2026-02-26 10:09:36 +00:00
  • 8317fd85df Add file logging for errors. Add exponential backoff retry to web search. On failed web search, do not crash pipeline, return placeholder text to language model William Jeynes 2026-02-24 13:05:35 +00:00
  • 3d0cacd24e Redo rules a little bit. Update fetch to retrieve only from some sources. Add statistics to display, fix rules display William Jeynes 2026-02-23 21:56:27 +00:00
  • cca3c42f5b Fix longstanding bug in wrapper. Add handling to allow for duplicate events to be handled. Remove analysis script (will replace with more in-depth work in main frontend) William Jeynes 2026-02-22 23:12:14 +00:00
  • 4d92f14527 Getting hits on the block list IMMEDIATELY. Log to file, might be important later William Jeynes 2026-02-22 15:42:27 +00:00
  • 2f33338007 Do not enter existing data if it has no good trigger events William Jeynes 2026-02-22 15:29:48 +00:00
  • d1ab938c0b Add filtering from known disinformation sources William Jeynes 2026-02-22 15:14:58 +00:00
  • 8ffe8dec82 Use cleaned trigger events in input.jsonl William Jeynes 2026-02-19 12:23:38 +00:00
  • 5efce05821 Update README to include description of data files William Jeynes 2026-02-19 11:43:25 +00:00
  • 78a49e2843 Start writing cleaned jsonl output. Re-add sentence to trigger prompt. Fix recursion limit William Jeynes 2026-02-19 11:36:31 +00:00
  • 6f20ade780 Make open webpage more appealing William Jeynes 2026-02-19 10:07:39 +00:00
  • b70b75bf28 Update readme. add human score calculation changes William Jeynes 2026-02-18 21:05:01 +00:00
  • dee9973c2a More work on scorer William Jeynes 2026-02-18 20:44:14 +00:00
  • a2cb93b44e Start refining scorer. Filter data passed to trigger event agent William Jeynes 2026-02-18 15:03:13 +00:00
  • 3f14b61cd4 Move all data to own folder. Add run shell script. Experiment (unsuccessfully so far) with example retrieval William Jeynes 2026-02-16 22:42:13 +00:00
  • 90894b2c10 Add some preliminary analysis William Jeynes 2026-02-16 14:42:47 +00:00
  • 6d478fe7ec Add multi claim runner. Add dbkf fetcher for automated testing. Add visualisation tool plus human score enterer. William Jeynes 2026-02-13 15:15:01 +00:00
  • fa6e7017b0 Add run scripts William Jeynes 2026-02-12 23:52:15 +00:00
  • 7fe63d6a98 Refactor calculating score. Add sort node for vanity William Jeynes 2026-02-12 23:46:00 +00:00
  • b06c08daab Add relation model. Add calculate score initial version William Jeynes 2026-02-12 23:26:59 +00:00
  • c89f73e138 Implement RAGAS metrics William Jeynes 2026-02-12 22:52:22 +00:00
  • 6dd6bf7eaf implement verification model William Jeynes 2026-02-12 22:32:24 +00:00
  • bef856d53a Refactor example retrieving, add option for dynamic data. Add hybrid reranking to tooling. Add parsing and loop infrastructure for trigger event processing William Jeynes 2026-02-12 14:33:12 +00:00
  • 06a302ec36 remove .keep William Jeynes 2026-02-09 21:46:28 +00:00
  • adccbd5740 cleanup requirements.txt for ragas service William Jeynes 2026-02-09 21:45:56 +00:00
  • eba5eb40a2 Add RAGAS initial version William Jeynes 2026-02-09 21:26:54 +00:00
  • cd2c8621e8 FEAT: implement temp version of main tooling feedback loop William Jeynes 2026-02-09 20:25:36 +00:00
  • 5841e8a922 add search query William Jeynes 2026-02-09 16:45:17 +00:00
  • 02eac0f553 Allow multiple source CSV files for normalisation. Implement real model node. Add normalization prompt. Implement normalization setup. Start on RAG retrieval functions William Jeynes 2026-02-09 16:32:40 +00:00
  • 8eaa7bfbff Add initial code for retrieval ranking for normalisation William Jeynes 2026-01-29 21:53:38 +00:00
  • a1373da891 create final nodes William Jeynes 2026-01-28 22:03:21 +00:00
  • c6416622e4 start adding dummy nodes William Jeynes 2026-01-28 21:26:34 +00:00
  • a3201d17a2 add initial testing William Jeynes 2026-01-27 22:57:49 +00:00
  • fdf8be2414 Repository Structure William Jeynes 2026-01-27 21:09:33 +00:00