Simulated Annealing
The Cooling Schedule That Governs Learning
Simulated annealing is the mathematical backbone of GESA. It determines how bold or conservative the system should be at any point in its learning history.
The Metallurgy Origin
In metallurgy, annealing is the process of heating a metal to a high temperature and then cooling it slowly and deliberately. The cooling schedule determines the final quality of the metal:
- Fast cooling → Brittle, disordered crystalline structure (the analogue of a local optimum)
- Slow, controlled cooling → Strong, well-ordered structure (the analogue of the global optimum)
The key insight: if you cool too quickly, atoms lock into the first stable configuration they find — which may not be the best one. Slow cooling gives them time to explore alternative arrangements and find the most stable overall structure.
In optimization, this maps directly to the exploration vs exploitation tradeoff:
- High temperature → Accept suboptimal moves. Explore widely. Escape local traps.
- Low temperature → Narrow toward proven solutions. Exploit what works.
The Temperature Formula
Temperature(t) = T₀ × α^t
Where:
T₀ = initial temperature (exploration budget)
α = cooling rate (0 < α < 1)
t = episode count
As episodes accumulate (t increases), temperature decreases. The system becomes progressively more conservative as it learns what works.
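The formula can be sketched in a few lines of Python. This is a minimal illustration, not GESA's actual implementation: the function name `temperature` and its parameter names are hypothetical, and the defaults (T₀ = 100, α = 0.95) are taken from the standard profile.

```python
def temperature(t: int, t0: float = 100.0, alpha: float = 0.95) -> float:
    """Exponential cooling: T(t) = T0 * alpha**t.

    `temperature`, `t0` and `alpha` are illustrative names, not a GESA API.
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must satisfy 0 < alpha < 1")
    if t < 0:
        raise ValueError("episode count t must be non-negative")
    return t0 * alpha ** t


if __name__ == "__main__":
    # Print the temperature at a few episode counts.
    for episodes in (0, 10, 20, 50, 100):
        print(f"t={episodes:>3}  T={temperature(episodes):6.1f}")
```

Because 0 < α < 1, each episode multiplies the temperature by the same factor, so the decay is exponential rather than linear: the largest drops in absolute terms happen early.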
Example Trajectory
With T₀ = 100 and α = 0.95 (standard profile):
| Episode Count | Temperature | Behaviour |
|---|---|---|
| 0 | 100.0 | Full exploration — anything considered |
| 10 | 59.9 | Mostly exploring, some exploitation |
| 20 | 35.8 | Balanced |
| 50 | 7.7 | Mostly exploiting proven strategies |
| 100 | 0.6 | Near-full exploitation |
Why Temperature Matters for GESA
Without a temperature schedule, GESA would either:
- Always be conservative (exploit proven strategies, never discover better ones)
- Always be exploratory (never converge on optimal strategies)
The annealing schedule resolves this: the system earns the right to be conservative by exploring first.
The Acceptance Rule
At high temperature, GESA includes bold, unproven, novel candidates in the generation set. At low temperature, these are filtered out — only validated, historically-supported strategies pass the selection stage.
This means early in a system's life, GESA might recommend interventions that didn't exist in any prior episode. Later, it converges on what the episode history has proven to work.
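For comparison, classical simulated annealing expresses its acceptance rule as the Metropolis criterion: a candidate that is worse by some amount Δ is still accepted with probability exp(−Δ / T). The sketch below shows that textbook rule only. Whether GESA's selection stage uses this exact form is not stated here, and the names `accept_probability` and `accept` are hypothetical.

```python
import math
import random


def accept_probability(delta: float, temp: float) -> float:
    """Metropolis criterion from classical simulated annealing.

    delta: how much worse the candidate is than the current choice
           (delta <= 0 means the candidate is at least as good).
    temp:  current temperature.
    This is the textbook rule, not a confirmed GESA internals.
    """
    if delta <= 0:
        return 1.0  # improvements are always accepted
    if temp <= 0:
        return 0.0  # fully cooled: never accept a worse candidate
    return math.exp(-delta / temp)


def accept(delta: float, temp: float, rng=random) -> bool:
    """Draw once and decide whether to keep the candidate."""
    return rng.random() < accept_probability(delta, temp)
```

With Δ = 10, the acceptance probability is about 0.90 at T = 100 but only about 0.14 at T = 5, which is the quantitative version of "bold early, conservative later".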
The StratIQX Temperature Schedule
Before GESA was named, the temperature schedule was already running in production as StratIQX's four-tier depth system:
| Depth Tier | Tokens/Section | Price Range | GESA Temperature |
|---|---|---|---|
| Quick | 512 | $1.25K–$2.5K | Low (exploitation) |
| Standard | 1,024 | $5K–$10K | Medium |
| Comprehensive | 2,048 | $15K–$37.5K | High |
| Enterprise | 2,048+ | $50K–$100K | Maximum (exploration) |
Higher temperature = wider exploration = higher value = higher price. The annealing schedule has a commercial model.
The "configurable pauses (500ms–1500ms) between agents" serve as pacing: they are the cooling rate made operational.
Temperature Profiles
GESA defines four built-in cooling profiles. See Temperature Profiles for the full specification.
| Profile | α | Use When |
|---|---|---|
| Fast Cool | 0.85 | Domain well-understood; convergence speed matters |
| Standard | 0.95 | Default; mixed exploration and exploitation |
| Slow Cool | 0.99 | Problem space unknown; premature convergence risk high |
| Adaptive | f(variance) | Self-tuning based on episode outcome variance |
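To see how much the choice of α matters, the cooling formula can be inverted to ask how many episodes each fixed profile needs before the temperature falls to a given level. The sketch below assumes T₀ = 100 and an arbitrary illustrative threshold of 10; `episodes_to_reach` and `PROFILES` are hypothetical names. The Adaptive profile is left out because its f(variance) is not specified in this section.

```python
import math

# Fixed cooling rates from the profile table above.
PROFILES = {"Fast Cool": 0.85, "Standard": 0.95, "Slow Cool": 0.99}


def episodes_to_reach(target: float, t0: float, alpha: float) -> int:
    """Smallest integer t such that t0 * alpha**t <= target.

    Derived from T(t) = t0 * alpha**t by taking logarithms;
    log(alpha) is negative, which flips the inequality.
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must satisfy 0 < alpha < 1")
    if not 0.0 < target <= t0:
        raise ValueError("target must satisfy 0 < target <= t0")
    return math.ceil(math.log(target / t0) / math.log(alpha))


if __name__ == "__main__":
    # T0 = 100 and a threshold of 10 are illustrative assumptions.
    for name, alpha in PROFILES.items():
        print(f"{name:<10} alpha={alpha}  episodes={episodes_to_reach(10.0, 100.0, alpha)}")
```

Under these assumptions, Fast Cool reaches the threshold after 15 episodes, Standard after 45, and Slow Cool after 230, so a small change in α shifts the exploration window by an order of magnitude.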
Biological Correspondence
Young cormorants explore widely — diving in varied locations, trying different techniques, accepting failed dives as learning experiences. Experienced cormorants exploit proven zones. The same species, the same biology, but different temperatures.
Young cormorant → T = 90: explore widely, accept failures
Adult cormorant → T = 20: exploit proven hunting grounds
Expert cormorant → T = 5: near-certain dives in known zones
GESA is the translation of this natural annealing schedule into a deliberate, observable, tunable system.