Defines the contract for integrating GEPA with external systems.
Adapters must implement evaluation, reflective dataset construction, and optionally custom proposal logic.
Type Parameters
Adapters work with three user-defined types:
- data_inst: Input data structure for tasks
- trajectory: Execution trace structure for reflection
- rollout_output: Program output structure
Required Callbacks
- evaluate/4: Execute program on batch and return scores
- make_reflective_dataset/4: Extract feedback from execution traces
Optional Callbacks
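- propose_new_texts/3: Custom instruction proposal logic
- propose_new_texts/4: Custom instruction proposal logic with adapter state (preferred for new adapters)
- get_adapter_state/1: Snapshot adapter-owned persistent state
- set_adapter_state/2: Restore adapter-owned persistent state after resume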
Example Implementation
defmodule MyAdapter do
  @behaviour GEPA.Adapter

  @impl true
  def evaluate(adapter, batch, candidate, capture_traces) do
    # Run program on batch
    # Return {:ok, %GEPA.EvaluationBatch{}}
  end

  @impl true
  def make_reflective_dataset(adapter, candidate, eval_batch, components) do
    # Extract feedback from trajectories
    # Return {:ok, dataset_map}
  end
end
Callbacks
@callback evaluate(
  adapter :: term(),
  batch :: [data_inst()],
  candidate :: candidate(),
  capture_traces :: boolean()
) :: {:ok, eval_batch()} | {:error, term()}
Evaluate a candidate program on a batch of data.
Parameters
- adapter: Adapter struct or state
- batch: List of data instances to evaluate
- candidate: Program as map of component name -> component text
- capture_traces: Whether to capture execution trajectories for reflection
Returns
- {:ok, eval_batch}: Successful evaluation with outputs, scores, and optional trajectories
- {:error, reason}: Systemic failure (configuration error, missing dependencies, etc.)
Contract
- Never raise on individual example failures; return failure scores instead
- length(eval_batch.outputs) == length(eval_batch.scores) == length(batch)
- If capture_traces=true, must populate eval_batch.trajectories
- Scores should be >= 0, higher is better
- Failed examples should return low scores (e.g., 0.0)
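The contract above can be sketched as follows. This is a minimal illustration, not the library's implementation: the module name SafeEvalSketch, the helper run_one/2, and the example shape (%{input: ..., answer: ...}) are hypothetical, and a plain map stands in for %GEPA.EvaluationBatch{} so the snippet runs without the library. Leaving trajectories as nil when capture_traces is false is also an assumption.

```elixir
defmodule SafeEvalSketch do
  # Hypothetical stand-in for the real program: prefixes the input with the
  # instruction and scores by exact suffix match against the expected answer.
  def run_one(%{input: input, answer: answer}, candidate) do
    output = candidate["instruction"] <> " " <> input
    score = if String.ends_with?(output, answer), do: 1.0, else: 0.0
    {output, score}
  end

  def evaluate(_adapter, batch, candidate, capture_traces) do
    results =
      Enum.map(batch, fn example ->
        try do
          {output, score} = run_one(example, candidate)
          {output, score, %{example: example, output: output, error: nil}}
        rescue
          e ->
            # Per-example failure: record it and score 0.0 instead of raising.
            {nil, 0.0, %{example: example, output: nil, error: Exception.message(e)}}
        end
      end)

    # Plain map used in place of %GEPA.EvaluationBatch{} (assumption).
    eval_batch = %{
      outputs: Enum.map(results, &elem(&1, 0)),
      scores: Enum.map(results, &elem(&1, 1)),
      trajectories: if(capture_traces, do: Enum.map(results, &elem(&1, 2)), else: nil)
    }

    {:ok, eval_batch}
  end
end
```

Mapping over the batch one example at a time keeps outputs, scores, and batch the same length by construction, even when some examples fail.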
Scoring Semantics
- GEPA uses sum(scores) for minibatch acceptance testing
- GEPA uses mean(scores) for validation set tracking
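For illustration, the two aggregates can be computed on a made-up score list as below. Only the aggregates themselves are shown; how GEPA compares them when accepting a candidate is not specified here, and the variable names are hypothetical.

```elixir
# Made-up per-example scores for one batch.
scores = [1.0, 0.0, 0.5, 1.0]

# Aggregate used for minibatch acceptance testing.
minibatch_total = Enum.sum(scores)

# Aggregate used for validation set tracking.
validation_mean = Enum.sum(scores) / length(scores)

IO.inspect({minibatch_total, validation_mean})
# prints {2.5, 0.625}
```

Because the sum grows with batch size, minibatch totals are only comparable between evaluations of the same batch.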
Optional: Return adapter-specific state for checkpoint persistence (get_adapter_state/1).
The returned value must be a fresh map and should be safe to serialize. GEPA
stores it opaquely in GEPA.State.adapter_state and never inspects it.
@callback make_reflective_dataset(
  adapter :: term(),
  candidate :: candidate(),
  eval_batch :: eval_batch(),
  components_to_update :: [String.t()]
) :: {:ok, reflective_dataset()} | {:error, term()}
Build reflective dataset from execution traces.
Extracts actionable feedback from trajectories to guide instruction refinement.
Only called when evaluate/4 was called with capture_traces=true.
Parameters
- adapter: Adapter struct or state
- candidate: The candidate that was evaluated
- eval_batch: Results from evaluate/4 with trajectories
- components_to_update: Subset of component names to generate feedback for
Returns
{:ok, dataset} where dataset is a map from component name to list of feedback records.
Recommended Record Schema
%{
  "Inputs" => %{...},             # Minimal view of inputs to component
  "Generated Outputs" => "...",   # What the component produced
  "Feedback" => "..."             # Performance feedback, errors, suggestions
}
Contract
- Dataset must be JSON-serializable (will be embedded in LLM prompts)
- Feedback should be actionable and concise
- Only generate datasets for components in components_to_update
- If using randomness for sampling, seed the RNG for determinism
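A minimal sketch of a dataset builder that follows the recommended record schema and this contract. The trajectory shape (a map with :input, :output, and :error keys), the module name ReflectiveDatasetSketch, and the feedback wording are all assumptions, since the trajectory type is user-defined. For simplicity every requested component receives the same records.

```elixir
defmodule ReflectiveDatasetSketch do
  # Assumes eval_batch exposes :trajectories and :scores, and that each
  # trajectory is a map with :input, :output, and :error (hypothetical shape).
  def make_reflective_dataset(_adapter, _candidate, eval_batch, components_to_update) do
    records =
      eval_batch.trajectories
      |> Enum.zip(eval_batch.scores)
      |> Enum.map(fn {traj, score} ->
        %{
          "Inputs" => %{"input" => traj.input},
          "Generated Outputs" => to_string(traj.output),
          "Feedback" => feedback(traj, score)
        }
      end)

    # Only the requested components get an entry in the dataset.
    {:ok, Map.new(components_to_update, fn name -> {name, records} end)}
  end

  defp feedback(%{error: error}, _score) when is_binary(error),
    do: "Failed with error: " <> error

  defp feedback(_traj, score) when score >= 1.0, do: "Correct."

  defp feedback(_traj, score),
    do: "Scored #{score}; output did not match the expected answer."
end
```

Records use only strings and nested maps with string keys, so they stay JSON-serializable. No sampling is done here, so no RNG seeding is needed.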
@callback propose_new_texts(
  candidate :: candidate(),
  reflective_dataset :: reflective_dataset(),
  components_to_update :: [String.t()]
) :: {:ok, %{required(String.t()) => String.t()}} | {:error, term()}
Optional: Custom instruction proposal logic.
Override default LLM-based proposal with task-specific logic.
If not implemented, GEPA uses GEPA.Strategies.InstructionProposal.
Parameters
- candidate: Current candidate program
- reflective_dataset: Feedback dataset from make_reflective_dataset/4
- components_to_update: Components to propose new text for
Returns
{:ok, new_texts} where new_texts is a map from component name to new text.
Use Cases
- Custom LLM prompting strategies
- Non-LLM based proposal (templates, rules, etc.)
- Multi-component joint optimization
@callback propose_new_texts(
  adapter :: term(),
  candidate :: candidate(),
  reflective_dataset :: reflective_dataset(),
  components_to_update :: [String.t()]
) :: {:ok, %{required(String.t()) => String.t()}} | {:error, term()}
Optional: Custom instruction proposal logic with adapter state.
Prefer this arity for new adapters. propose_new_texts/3 remains supported
for backward compatibility with early gepa_ex adapters.
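As an illustration of the non-LLM use case, the sketch below implements the four-argument form with a simple template rule: it appends distinct feedback strings to the current component text. The module name TemplateProposalSketch, the :max_hints option, and the template wording are hypothetical, and the @behaviour declaration is omitted so the snippet runs without the library.

```elixir
defmodule TemplateProposalSketch do
  # Hypothetical adapter struct; :max_hints is an illustrative option.
  defstruct max_hints: 3

  def propose_new_texts(%__MODULE__{max_hints: max}, candidate, reflective_dataset, components_to_update) do
    new_texts =
      Map.new(components_to_update, fn name ->
        hints =
          reflective_dataset
          |> Map.get(name, [])
          |> Enum.map(& &1["Feedback"])
          |> Enum.reject(&is_nil/1)
          |> Enum.uniq()
          |> Enum.take(max)

        {name, append_hints(Map.fetch!(candidate, name), hints)}
      end)

    {:ok, new_texts}
  end

  # With no feedback available, the component text is returned unchanged.
  defp append_hints(text, []), do: text

  defp append_hints(text, hints) do
    text <> "\nAvoid these observed problems:\n" <> Enum.map_join(hints, "\n", &("- " <> &1))
  end
end
```

The adapter argument is what lets the rule read its configuration here; the three-argument form would have to hard-code such settings.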
Optional: Restore adapter-specific state from a checkpoint (set_adapter_state/2).
Mutable adapters may update their internal process/resource and return :ok.
Pure data adapters may return an updated adapter struct; the current engine
treats that as advisory because adapter structs are passed in user config.
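A minimal sketch of the mutable-adapter case, pairing a snapshot with a restore. The exact callback signatures are not shown in this section, so the argument order (adapter first, then the saved state) and the bare-map return value of get_adapter_state/1 are assumptions. The module name CounterAdapterSketch and its rollout counter are hypothetical.

```elixir
defmodule CounterAdapterSketch do
  # Hypothetical mutable adapter: holds the pid of an Agent that counts rollouts.
  defstruct [:agent]

  def new do
    {:ok, agent} = Agent.start_link(fn -> %{rollouts: 0} end)
    %__MODULE__{agent: agent}
  end

  def record_rollouts(%__MODULE__{agent: agent}, n) do
    Agent.update(agent, &Map.update!(&1, :rollouts, fn count -> count + n end))
  end

  # Assumed shape: takes the adapter, returns a plain serializable map.
  # The pid is deliberately left out because it cannot be restored later.
  def get_adapter_state(%__MODULE__{agent: agent}) do
    Agent.get(agent, &Map.take(&1, [:rollouts]))
  end

  # Assumed shape: takes the adapter and the saved map, updates the Agent,
  # and returns :ok.
  def set_adapter_state(%__MODULE__{agent: agent}, %{rollouts: rollouts}) do
    Agent.update(agent, fn state -> %{state | rollouts: rollouts} end)
  end
end
```

Usage, under the same assumptions: snapshot one adapter and restore the map into a fresh one.

```elixir
original = CounterAdapterSketch.new()
:ok = CounterAdapterSketch.record_rollouts(original, 5)
saved = CounterAdapterSketch.get_adapter_state(original)

resumed = CounterAdapterSketch.new()
:ok = CounterAdapterSketch.set_adapter_state(resumed, saved)
```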