10f2644408
Use a slightly smaller model. Reduce concurrency. Be clearer in the prompts
William Jeynes
2026-04-02 20:10:57 +01:00
7e586fe17d
Allow for configurable ranking server url. Delete old ragas call
William Jeynes
2026-04-02 13:48:15 +01:00
7e37a22058
Switch to actual instruction model. For debug, log entire object.
William Jeynes
2026-04-02 13:18:02 +01:00
2ed47980ef
Add better error handling to LLM output response
William Jeynes
2026-03-31 19:26:56 +01:00
01b04dd73e
use a model we know has tool calling capabilities
William Jeynes
2026-03-31 18:26:55 +01:00
593baf9b15
add extra options
William Jeynes
2026-03-31 17:15:55 +01:00
893829e599
Switch to CPU only, so as not to confuse the GPU
William Jeynes
2026-03-31 16:09:41 +01:00
36c30a427d
Update deps. Install ollama for LangChain. Update model to deepseek
William Jeynes
2026-03-31 16:08:28 +01:00
b610e8c989
Add sentence transformers to requirements for ensemble service
William Jeynes
2026-03-31 15:52:14 +01:00
f8d4155b7c
Add more robust parsing of LLM JSON output
William Jeynes
2026-03-27 11:09:59 +00:00
38ca7a3d34
I can't afford the full model lol. Use jsonrepair module to fix agent malformed JSON instead.
experiments2-new-model
William Jeynes
2026-03-26 15:37:14 +00:00
38b6fb6a0e
Use an even better model
William Jeynes
2026-03-26 15:14:43 +00:00
5e374a8bd6
Fix errors seen during longer runs: selenium exceptions, insecure certificates, recursion limit exceeded, BM25 document corpus too small
William Jeynes
2026-03-26 12:22:13 +00:00
77cdd9a01c
Add statistics for model experiments. Fix dead link in documentation
William Jeynes
2026-03-25 21:57:52 +00:00
a7f5978f64
Update documentation. Stop storing context. Decide on final claims source
William Jeynes
2026-03-25 14:24:55 +00:00
872346c657
Update run.sh to match new evaluation service
William Jeynes
2026-03-24 19:16:48 +00:00
8f939d54c4
Implement ensemble into final model structure
William Jeynes
2026-03-24 19:07:24 +00:00
624d45bc53
Re-allow multithreading on service. Add results table
William Jeynes
2026-03-24 18:29:40 +00:00
80bc151379
add majority voting
William Jeynes
2026-03-24 16:50:41 +00:00
5ce64290ce
Make an ensemble model to combine scores together (very high accuracy)
William Jeynes
2026-03-24 15:50:41 +00:00
87fccb7e2b
Add downloading from hugging face
William Jeynes
2026-03-24 13:23:08 +00:00
8c1e35f66f
Increase dropout on regression model to cut down on overfitting
William Jeynes
2026-03-24 13:16:18 +00:00
44395bb251
add linear regression model initial version
William Jeynes
2026-03-24 12:25:15 +00:00
e368c50577
Add training scripts for distilled, flan. Add run service for flan
William Jeynes
2026-03-23 22:43:59 +00:00
00e1596be0
tuned parameters for roberta_distilled?
deberta_test
William Jeynes
2026-03-23 15:45:18 +00:00
070aab6a5c
Actually we need to go the other way
William Jeynes
2026-03-23 14:03:06 +00:00
bff5423f3d
testing code for deberta, need to run on GPU
William Jeynes
2026-03-22 16:55:21 +00:00
c69730df6b
Refine scoring to allow for better iteration on frontend. Update generate_adversarial.py
William Jeynes
2026-03-22 16:04:38 +00:00
f4e84af272
Make the model less overfitting. Make it harder for an event to be classed as "perfect"
William Jeynes
2026-03-18 01:05:24 +00:00
886b9a7d5d
Ensure it works on CUDA for extra speed
William Jeynes
2026-03-17 23:14:50 +00:00
8052d5c7ba
Working on making the classifier harsher on unseen data
William Jeynes
2026-03-17 22:19:03 +00:00
b08c1ada70
Small changes for the next set of human ranking
William Jeynes
2026-03-17 00:18:32 +00:00
c89c7054fe
Update agent to support new verification style. Update frontend to support new file format and remove redundant logic from old experiments.
William Jeynes
2026-03-16 17:16:58 +00:00
0a7bb114d2
Add removing of duplicates from pipeline. Add to sort step. Move score logic to robertaMetrics node.
William Jeynes
2026-03-13 14:51:14 +00:00
d5c6cb444d
Add better scoring, ignoring duplicates, catching under and over confidence. Showing difference between "FINE" and "PERFECT"
William Jeynes
2026-03-13 12:18:52 +00:00
8311556855
Add ROBERTA classifier ranking PoC, with 77pc off the bat
William Jeynes
2026-03-13 11:24:51 +00:00
f09e36e740
Add initial version of ROBERTA classifier, add ability for multiple pie charts
William Jeynes
2026-03-11 22:02:31 +00:00
ef6330ec07
Add re-ranker mode to support re-ranking experiments, hopefully we can reduce the loss
William Jeynes
2026-03-06 17:27:09 +00:00
f14d112017
Add difference between auto scoring system and our own labels
William Jeynes
2026-03-03 15:58:39 +00:00
6ae551a93f
Ensure date is passed to pipeline. Fix wording
William Jeynes
2026-03-02 14:58:26 +00:00
c94812ed80
Prepare for mass data collection. Reduce concurrency so as not to overwhelm scraper on long sessions. Remove duplicates from fetch script. Remove naming weirdness on scorer frontend.
William Jeynes
2026-02-27 14:41:10 +00:00
201176e71c
Refactor scorer for future maintainability
William Jeynes
2026-02-26 10:25:49 +00:00
6c3aa7343d
Update how scoring works with two passes of the data for timesaving. Add section on edge case handling to rules.
William Jeynes
2026-02-26 10:09:36 +00:00
8317fd85df
Add file logging for errors. Add exponential backoff retry to web search. On failed web search, do not crash pipeline, return placeholder text to language model
William Jeynes
2026-02-24 13:05:35 +00:00
3d0cacd24e
Redo rules a little bit. Update fetch to retrieve only from some sources. Add statistics to display, fix rules display
William Jeynes
2026-02-23 21:56:27 +00:00
cca3c42f5b
Fix longstanding bug in wrapper. Add handling to allow for duplicate events to be handled. Remove analysis script (will replace with more in-depth work in main frontend)
William Jeynes
2026-02-22 23:12:14 +00:00
4d92f14527
Getting hits on the block list IMMEDIATELY. Log to file, might be important later
William Jeynes
2026-02-22 15:42:27 +00:00
2f33338007
Do not enter existing data if it has no good trigger events
William Jeynes
2026-02-22 15:29:48 +00:00
d1ab938c0b
Add filtering from known disinformation sources
William Jeynes
2026-02-22 15:14:58 +00:00
8ffe8dec82
Use cleaned trigger events in input.jsonl
William Jeynes
2026-02-19 12:23:38 +00:00
5efce05821
Update README to include description of data files
William Jeynes
2026-02-19 11:43:25 +00:00
78a49e2843
Start writing cleaned jsonl output. Re-add sentence to trigger prompt. Fix recursion limit
William Jeynes
2026-02-19 11:36:31 +00:00
6f20ade780
Make open webpage more appealing
William Jeynes
2026-02-19 10:07:39 +00:00
b70b75bf28
Update readme. add human score calculation changes
William Jeynes
2026-02-18 21:05:01 +00:00
dee9973c2a
More work on scorer
William Jeynes
2026-02-18 20:44:14 +00:00
a2cb93b44e
Start refining scorer. Filter data passed to trigger event agent
William Jeynes
2026-02-18 15:03:13 +00:00
3f14b61cd4
Move all data to own folder. Add run shell script. Experiment (unsuccessfully so far) with example retrieval
William Jeynes
2026-02-16 22:42:13 +00:00
90894b2c10
Add some preliminary analysis
William Jeynes
2026-02-16 14:42:47 +00:00
6d478fe7ec
Add multi claim runner. Add dbkf fetcher for automated testing. Add visualisation tool plus human score enterer.
William Jeynes
2026-02-13 15:15:01 +00:00
fa6e7017b0
Add run scripts
William Jeynes
2026-02-12 23:52:15 +00:00
7fe63d6a98
Refactor calculating score. Add sort node for vanity
William Jeynes
2026-02-12 23:46:00 +00:00
b06c08daab
Add relation model. Add calculate score initial version
William Jeynes
2026-02-12 23:26:59 +00:00
c89f73e138
Implement RAGAS metrics
William Jeynes
2026-02-12 22:52:22 +00:00
6dd6bf7eaf
implement verification model
William Jeynes
2026-02-12 22:32:24 +00:00
bef856d53a
Refactor example retrieving, add option for dynamic data. Add hybrid reranking to tooling. Add parsing and loop infrastructure for trigger event processing
William Jeynes
2026-02-12 14:33:12 +00:00
06a302ec36
remove .keep
William Jeynes
2026-02-09 21:46:28 +00:00
adccbd5740
cleanup requirements.txt for ragas service
William Jeynes
2026-02-09 21:45:56 +00:00
eba5eb40a2
Add RAGAS initial version
William Jeynes
2026-02-09 21:26:54 +00:00
cd2c8621e8
FEAT: implement temp version of main tooling feedback loop
William Jeynes
2026-02-09 20:25:36 +00:00
5841e8a922
add search query
William Jeynes
2026-02-09 16:45:17 +00:00
02eac0f553
Allow multiple source CSV files for normalisation. Implement real model node. Add normalisation prompt. Implement normalisation setup. Start on RAG retrieval functions
William Jeynes
2026-02-09 16:32:40 +00:00
8eaa7bfbff
Add initial code for retrieval ranking for normalisation
William Jeynes
2026-01-29 21:53:38 +00:00
a1373da891
create final nodes
William Jeynes
2026-01-28 22:03:21 +00:00
c6416622e4
start adding dummy nodes
William Jeynes
2026-01-28 21:26:34 +00:00
a3201d17a2
add initial testing
William Jeynes
2026-01-27 22:57:49 +00:00
fdf8be2414
Repository Structure
William Jeynes
2026-01-27 21:09:33 +00:00