Add self improvement pattern with two new prompt nodes
@@ -0,0 +1,30 @@
You are an impartial and meticulous evaluator assessing an LLM’s response based on key quality dimensions
of honesty and usefulness. Your goal is to provide structured feedback that can be used to improve the
response.
Evaluation task: please follow these steps carefully:
1. Analyze the response based on the three dimensions below.
2. Provide justifications first: write a brief explanation justifying your assessment for each dimension.
3. Assign scores after justification: assign a score from 1 (poor) to 10 (excellent) for each dimension
based on your justification.
4. Synthesize: provide a brief overall impression and the single most important suggestion for
improvement, keeping in mind that explanation/honesty is the top priority, followed by guidance.
Critique dimensions (evaluate in this order):
(1) Specificity and usefulness: Can the proposed event be used to create a dataset of concrete events mapped to later
disinformation?
(2) Existence: Using the context provided, can the user be certain that the proposed trigger event actually happened?
(3) Causality: Is there a possible link from the proposed trigger event to the disinformation at hand?
Overall impression & key improvement suggestion: Briefly summarize the overall quality and state the
most critical change needed to improve the response.

Disinformation query:
###NTITLE###
Disinformation date:
###CDATE###

LLM’s response to evaluate:
###LM###

Provided context:
###VESEARCHES###

Let's think it through step by step
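The evaluator prompt asks for a justification and a 1 to 10 score per dimension, plus an overall impression and one key suggestion, but it does not prescribe a machine-readable format. If a downstream node ever needed that structure, it could be modeled roughly as below. Every name here (`Dimension`, `Critique`, `isValidScore`, `lowestScoring`) is hypothetical and not part of this commit, and treating scores as integers is an assumption.

```typescript
// Hypothetical structure; the prompt itself only asks for free-form text.
type Dimension = "specificity" | "existence" | "causality";

interface DimensionCritique {
  justification: string; // written before the score (step 2 of the prompt)
  score: number;         // 1 (poor) to 10 (excellent); integer is an assumption
}

interface Critique {
  dimensions: Record<Dimension, DimensionCritique>;
  overallImpression: string;
  keySuggestion: string;
}

// True only for whole numbers in the 1..10 range named by the prompt.
function isValidScore(score: number): boolean {
  return Math.floor(score) === score && score >= 1 && score <= 10;
}

// Returns the dimension with the lowest score; ties keep the earlier
// dimension in the prompt's evaluation order.
function lowestScoring(critique: Critique): Dimension {
  const order: Dimension[] = ["specificity", "existence", "causality"];
  let worst: Dimension = "specificity";
  for (const dimension of order) {
    if (critique.dimensions[dimension].score < critique.dimensions[worst].score) {
      worst = dimension;
    }
  }
  return worst;
}
```

A helper like `lowestScoring` would point a later revision step at the weakest dimension first.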
@@ -15,6 +15,10 @@ export async function hydratePrompt(path: string, state: any) : Promise<string>
        raw = raw.replace("###LM###", state.messages.at(-1).content);
    }

    if (raw.indexOf("###L2M###") != -1) {
        raw = raw.replace("###L2M###", state.messages.at(-2).content);
    }

    if (raw.indexOf("###NTITLE###") != -1) {
        raw = raw.replace("###NTITLE###", state.normalizedClaim);
    }
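The new `###L2M###` branch reads the second-to-last message, so it only yields the original response when the critique is the last message in the history. `Array.prototype.at(-2)` returns `undefined` when fewer than two messages exist, and reading `.content` from that would throw a `TypeError`. A minimal sketch of that ordering assumption with an explicit guard follows; the `Message` shape and the `messageFromEnd` helper are hypothetical, since the diff types `state` as `any`.

```typescript
// Hypothetical message shape; the real state type is `any` in the diff.
interface Message {
  role: string;
  content: string;
}

// Returns the message `offset` positions from the end (1 = last) and fails
// with a descriptive error instead of a TypeError on `undefined.content`.
function messageFromEnd(messages: Message[], offset: number): Message {
  const message = messages[messages.length - offset];
  if (message === undefined) {
    throw new Error(`expected at least ${offset} messages, got ${messages.length}`);
  }
  return message;
}

const history: Message[] = [
  { role: "assistant", content: "draft events JSON" }, // response to improve
  { role: "assistant", content: "critique text" },     // critique of that response
];

const critique = messageFromEnd(history, 1).content; // would fill ###LM###
const response = messageFromEnd(history, 2).content; // would fill ###L2M###
```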
@@ -33,5 +37,12 @@ export async function hydratePrompt(path: string, state: any) : Promise<string>
        raw = raw.replace("###TESEARCH###", output)
    }

    if (raw.indexOf("###VESEARCHES###") != -1) {
        const output = state.evalTriggerEvent
            .map(e => e.context)
            .join("\n")
        raw = raw.replace("###VESEARCHES###", output)
    }

    return raw;
}

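One thing to watch in the substitution pattern used by these hunks: `String.prototype.replace` with a string pattern replaces only the first occurrence, and a string replacement value interprets sequences such as `$&` or `$1`. Model output or search context containing a dollar sign could therefore be altered on insertion. The sketch below shows one possible alternative, a single pass with a function replacer, which inserts values verbatim. `fillPlaceholders` is a hypothetical helper, not code from this commit, and the `###NAME###` token pattern is inferred from the placeholders visible in the diff.

```typescript
// Hypothetical helper: replaces every ###NAME### token in one pass.
// Tokens without a matching value are left untouched.
function fillPlaceholders(template: string, values: Record<string, string>): string {
  return template.replace(
    /###([A-Z0-9]+)###/g,
    (match: string, name: string): string => {
      const value = values[name];
      // A function replacer's return value is inserted as-is, so "$&" in
      // the value is not expanded.
      return value === undefined ? match : value;
    },
  );
}

const hydrated = fillPlaceholders(
  "Critique:\n###LM###\n\nDate:\n###CDATE###",
  { LM: "Score 4/10: the $& claim is unsupported" },
);
console.log(hydrated);
```

Because the pass is single, a value that itself contains a token such as `###NTITLE###` is not substituted again, which is a behavioral difference from chaining `replace` calls.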
@@ -0,0 +1,40 @@
You are an expert editor tasked with making targeted improvements to an existing LLM’s response based
on a specific critique with the primary goal of enhancing its score according to evaluation standards while
preserving its strengths.
Your revision task: generate a revised version of the existing response. Your goal is not to rewrite it
completely, but to make precise edits only to address the specific weaknesses highlighted in the critique.
Instructions for editing:
- Identify specific flaws: carefully read the critique and pinpoint the exact issues raised (e.g., unclear
explanation, vagueness, inappropriate responses, the key suggestion).
- Perform minimal targeted edits: modify only the necessary sentences or paragraphs within the existing
response to directly fix these identified flaws.
- Strongly preserve strengths: crucially keep all other parts of the existing response intact. Do not
rephrase, restructure, or remove sections that were not criticized or likely contributed positively to its
initial score.
- Ensure coherence: verify that your targeted edits integrate smoothly and do not introduce contradictions
or awkward phrasing.
Output requirements:
- It should feel like a slightly polished or corrected version of the existing response, not a fundamentally
different answer.
- Do not mention the critique, scores, or the editing process. The output should be clean JSON that passes validation checks.

Again, use a JSON format with each entry containing "Event,ReasoningWhyRelevant,SearchQuery,Url,Date".
Use tools available to you if further information is required.

Add no new events; only improve the existing items.

Disinformation query:
###NTITLE###
Disinformation date:
###CDATE###

LLM’s response to improve:
###L2M###

Critique:
###LM###

This contains specific feedback, justifications, scores from 1 to 10, and potentially a key improvement
suggestion. Focus on the justifications for low scores and the key suggestion.

Let's think it through step by step
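The revision prompt requires clean JSON with five named fields per entry and forbids adding events. The commit does not show the validation checks it mentions, so the following is only a sketch of what such a check might look like. It assumes the response is a top-level JSON array and that every field is a string; `parseEvents`, `checkRevision`, and `TriggerEvent` are hypothetical names.

```typescript
// Field names taken from the prompt; string-typed values are an assumption.
interface TriggerEvent {
  Event: string;
  ReasoningWhyRelevant: string;
  SearchQuery: string;
  Url: string;
  Date: string;
}

// Reads one required string field or throws with the entry index.
function readString(record: Record<string, unknown>, field: string, index: number): string {
  const value = record[field];
  if (typeof value !== "string") {
    throw new Error(`entry ${index}: "${field}" is missing or not a string`);
  }
  return value;
}

// Parses a model response and checks that each entry has all five fields.
function parseEvents(json: string): TriggerEvent[] {
  const parsed: unknown = JSON.parse(json);
  if (!Array.isArray(parsed)) {
    throw new Error("expected a JSON array of events");
  }
  return parsed.map((entry: unknown, index: number): TriggerEvent => {
    if (typeof entry !== "object" || entry === null) {
      throw new Error(`entry ${index} is not an object`);
    }
    const record = entry as Record<string, unknown>;
    return {
      Event: readString(record, "Event", index),
      ReasoningWhyRelevant: readString(record, "ReasoningWhyRelevant", index),
      SearchQuery: readString(record, "SearchQuery", index),
      Url: readString(record, "Url", index),
      Date: readString(record, "Date", index),
    };
  });
}

// Enforces "add no new events": the revision may not be longer than the original.
function checkRevision(originalJson: string, revisedJson: string): TriggerEvent[] {
  const original = parseEvents(originalJson);
  const revised = parseEvents(revisedJson);
  if (revised.length > original.length) {
    throw new Error(`revision added events: ${original.length} -> ${revised.length}`);
  }
  return revised;
}
```

A failed check could be fed back to the model as another critique round, though how the graph handles that is outside what this diff shows.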