Constraint verification that catches what LLMs miss. Adversarial-robust, self-learning,
real-time guided decoding. Built in Rust + JAX. pip install carnot
Autoregressive models commit to tokens one at a time, with no global consistency check: one wrong token poisons the rest of the sequence, and there is no way to backtrack. Carnot adds a holistic verification layer that treats the entire output as a single energy state.
Energy-Based Models (EBMs) assign a single energy to the whole output. Low energy = valid. High energy = violation. Repair runs gradient descent on that energy.
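The verify-then-repair idea can be sketched with a toy constraint energy. Everything below (the quadratic energy, the repair loop, the function names) is an illustrative assumption, not Carnot's implementation:

```python
# Toy EBM sketch: the claim "a + b = c" gets a quadratic constraint energy.
# Zero energy iff the claim holds; repair descends the energy gradient.

def energy(a, b, c):
    """Energy of the claim 'a + b = c': zero exactly when the claim is valid."""
    return (a + b - c) ** 2

def repair(a, b, c, lr=0.1, steps=100):
    """Gradient-descend the claimed result c toward the low-energy (valid) state."""
    for _ in range(steps):
        grad_c = -2 * (a + b - c)  # dE/dc
        c = c - lr * grad_c
    return c

print(energy(47, 28, 75))         # 0 -> valid, low energy
print(energy(47, 28, 76))         # 1 -> violation, high energy
print(round(repair(47, 28, 76)))  # 75: gradient descent pulls c back to validity
```

The same picture scales up: a learned energy over the whole output plays the role of the hand-written quadratic here.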
Four energy backbones:
- KAN (default): learnable splines, 0.994 AUROC, 8.7x fewer params; best for verification.
- Ising: fastest sampling, maps to FPGA/TSU hardware; best for real-time guided decoding.
- Gibbs: MLP energy for complex patterns.
- Boltzmann: deep residual network for research.
Rule of thumb: KAN for accuracy, Ising for speed.
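For a sense of how an Ising backbone scores an output, here is a minimal sketch. The couplings J and field h are invented for the example (in practice they are learned); this is not Carnot's code:

```python
# Toy Ising energy: claims are spins in {-1, +1}; couplings encode which
# claims should agree. Consistent configurations get low energy.

def ising_energy(spins, J, h):
    """E(s) = -sum_ij J[i][j]*s_i*s_j - sum_i h[i]*s_i."""
    n = len(spins)
    pair = sum(J[i][j] * spins[i] * spins[j] for i in range(n) for j in range(n))
    field = sum(h[i] * spins[i] for i in range(n))
    return -pair - field

J = [[0.0, 1.0], [1.0, 0.0]]  # claims 0 and 1 prefer to agree
h = [0.0, 0.0]

print(ising_energy([1, 1], J, h))   # -2.0: consistent claims, low energy
print(ising_energy([1, -1], J, h))  #  2.0: contradiction, high energy
```

Because the energy is a sum of cheap local terms, it maps naturally onto FPGA-style hardware, which is why Ising is the speed option.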
VerifyRepairPipeline: extract constraints, verify via Ising, repair with LLM feedback. +15% on adversarial math, +6% on HumanEval. 5 lines of Python.
Constraint energy steers token generation in real time: 0.006ms per check, 100x faster than a single token generation step. Kona-style reasoning.
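Energy-guided decoding can be sketched as re-weighting candidate logits by a constraint energy before sampling. The function and the toy constraint below are assumptions for illustration, not Carnot's decoder:

```python
import math

# Toy guided-decoding step: subtract a scaled constraint energy from each
# candidate token's logit, then renormalize. Violating tokens are suppressed
# before sampling ever happens.

def guided_step(logits, candidates, energy_fn, strength=5.0):
    """Return post-softmax probabilities after the energy penalty."""
    scores = [l - strength * energy_fn(tok) for l, tok in zip(logits, candidates)]
    m = max(scores)                               # stabilize the softmax
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

# Toy constraint: the next digit must complete "47 + 28 = 7_" correctly.
energy = lambda tok: 0.0 if tok == "5" else 1.0
probs = guided_step([2.0, 2.1, 1.9], ["4", "5", "6"], energy)
print(max(zip(probs, ["4", "5", "6"]))[1])  # "5" dominates despite similar logits
```

Since each check is far cheaper than a forward pass, the penalty can run on every decoding step without noticeable slowdown.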
Apple researchers showed that adding irrelevant sentences to a math problem drops LLM accuracy by 21%. Carnot recovers +15% because Ising verification ignores irrelevant context entirely.
ConstraintStateMachine tracks verified facts across multi-step agent workflows and rolls back to the last consistent state on violation. 60/60 violations caught in the workflow benchmark.
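The rollback behavior can be sketched with a toy fact tracker. FactTracker and its methods are hypothetical stand-ins, not the ConstraintStateMachine API:

```python
# Toy sketch of fact tracking with rollback: each accepted fact snapshots a
# consistent state; a contradicting fact triggers a rollback to the last one.

class FactTracker:
    def __init__(self):
        self.facts = {}          # currently verified facts
        self.checkpoints = [{}]  # known-consistent snapshots

    def assert_fact(self, key, value):
        """Accept a fact if consistent with verified state; else roll back."""
        if key in self.facts and self.facts[key] != value:
            self.facts = dict(self.checkpoints[-1])  # rollback on violation
            return False
        self.facts[key] = value
        self.checkpoints.append(dict(self.facts))    # new consistent state
        return True

t = FactTracker()
t.assert_fact("capital_of_france", "Paris")      # consistent: accepted
ok = t.assert_fact("capital_of_france", "Lyon")  # contradiction: rolled back
print(ok, t.facts["capital_of_france"])          # False Paris
```

In a real agent workflow the snapshot would cover tool outputs and intermediate conclusions, not just a flat dict, but the contract is the same: violations never leave the agent in an inconsistent state.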
Gets smarter with use: 67.6% → 97.0% over 500 questions. Online constraint generation from memory patterns, persistent across sessions. 96% factual claim coverage via Wikidata.
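The self-learning loop can be sketched as a persistent constraint store: violations seen once become constraints checked forever after. ConstraintMemory, its methods, and the JSON file format are assumptions for illustration, not Carnot's mechanism:

```python
import json
import os
import tempfile

# Toy sketch of online constraint learning: each corrected violation is
# remembered as a constraint and persisted, so later sessions start smarter.

class ConstraintMemory:
    def __init__(self, path):
        self.path = path
        self.constraints = {}
        if os.path.exists(path):
            with open(path) as f:
                self.constraints = json.load(f)

    def learn(self, claim, correction):
        """Remember a correction as a constraint for all future checks."""
        self.constraints[claim] = correction
        with open(self.path, "w") as f:
            json.dump(self.constraints, f)

    def check(self, claim):
        """Known-bad claims are rejected immediately."""
        return claim not in self.constraints

path = os.path.join(tempfile.mkdtemp(), "constraints.json")
m1 = ConstraintMemory(path)
print(m1.check("47 + 28 = 76"))   # True: not yet learned
m1.learn("47 + 28 = 76", "47 + 28 = 75")

m2 = ConstraintMemory(path)       # a new session reads the same store
print(m2.check("47 + 28 = 76"))   # False: remembered across sessions
```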
from carnot.pipeline import VerifyRepairPipeline
# Verify any LLM output in 3 lines
pipeline = VerifyRepairPipeline()
result = pipeline.verify("What is 47 + 28?", "47 + 28 = 76")
print(result.verified) # False
print(result.violations) # [47 + 28 = 76 (correct: 75)]
# Auto-repair with LLM feedback loop
fixed = pipeline.verify_and_repair(
    "What is 47 + 28?", model="Qwen/Qwen3.5-0.8B"
)
print(fixed.final_response) # "47 + 28 = 75"