Self-improving agents. Zero hand-holding.
Open-source infrastructure for Claude Code skills that detect failures, evolve configurations, and earn trust through reliability.
Autonomous Bounties
Skills detect their own failures. Create structured improvement requests. Resolve issues without human debugging.
Evolution
Controlled experiments on configurations. Statistical winner selection. Zero-downtime deployment of improvements.
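To make "statistical winner selection" concrete, here is a minimal sketch, assuming each configuration is judged by its task success rate and compared with a two-proportion z-test. The function names, the (successes, trials) input shape, and the 0.05 threshold are illustrative assumptions, not CatalystRL's actual API or method.

```python
import math


def two_proportion_z_test(successes_a, trials_a, successes_b, trials_b):
    """Return (z, two-sided p-value) for the difference in success rates."""
    p_a = successes_a / trials_a
    p_b = successes_b / trials_b
    # Pooled success rate under the null hypothesis that both configs are equal.
    pooled = (successes_a + successes_b) / (trials_a + trials_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / trials_a + 1 / trials_b))
    if se == 0:
        return 0.0, 1.0  # no variation at all, so no evidence either way
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


def pick_winner(baseline, candidate, alpha=0.05):
    """Hypothetical helper: promote the candidate only on a significant gain.

    `baseline` and `candidate` are (successes, trials) tuples.
    """
    z, p_value = two_proportion_z_test(*baseline, *candidate)
    if z > 0 and p_value < alpha:
        return "candidate"
    return "baseline"
```

In this sketch the baseline stays in place unless the candidate is significantly better, so a noisy experiment never replaces a working configuration.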
Trust Engine
Agents progress from novice to autonomous through demonstrated reliability. Capabilities unlock as trust accumulates.
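One way to picture trust accumulating is a record of outcomes mapped onto tiers. The sketch below is an assumption for illustration only: the intermediate "trusted" tier, the run counts, and the success-rate thresholds are hypothetical, and only the "novice" and "autonomous" endpoints come from the description above.

```python
from dataclasses import dataclass

# Hypothetical tiers, highest first: (name, minimum runs, minimum success rate).
TIERS = [
    ("autonomous", 100, 0.95),
    ("trusted", 25, 0.85),
    ("novice", 0, 0.0),
]


@dataclass
class TrustRecord:
    """Running tally of an agent's outcomes (illustrative, not the real engine)."""

    successes: int = 0
    failures: int = 0

    def record(self, ok: bool) -> None:
        """Count one completed run as a success or a failure."""
        if ok:
            self.successes += 1
        else:
            self.failures += 1

    @property
    def runs(self) -> int:
        return self.successes + self.failures

    @property
    def success_rate(self) -> float:
        return self.successes / self.runs if self.runs else 0.0

    def level(self) -> str:
        """Return the highest tier whose run count and success rate are both met."""
        for name, min_runs, min_rate in TIERS:
            if self.runs >= min_runs and self.success_rate >= min_rate:
                return name
        return "novice"
```

Requiring a minimum number of runs as well as a success rate keeps an agent with three lucky runs from unlocking the same capabilities as one with a long track record.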
Why we build in the open
The power of AI agents should be accessible to everyone, not locked behind enterprise contracts or proprietary platforms. CatalystRL is our contribution to that future.
Software has come full circle: software to AI to software. We use traditional software for directed behavior: deterministic scripts, validation gates, structured workflows. We use AI for self-directed behavior: reasoning, adaptation, autonomous decisions. The hybrid yields agents that are both reliable and intelligent.
Every claim we make is backed by tests and data. Trust isn't assumed. It's earned through measured performance. Our agents track their own success rates, and we publish what we learn.
Open source is how we grow. Contributors bring fresh perspectives. Users surface edge cases we'd never find alone. The data we gather collectively pushes self-improving agents forward faster than any closed system could.
Most agent frameworks stop at code generation. We go further. CatalystRL agents cover the full journey from inception to delivery: research, planning, architecture, implementation, testing, documentation, deployment. Not just engineering, but every facet of building something real.
Skill agents are our Catalyst toward self-improving AI. Each skill is a building block that learns from its mistakes, earns trust through reliability, and evolves through real-world usage. The name says it all.
The 15 Tenets
Every CatalystRL agent is built on these principles.
Self-Healing
Agents detect and fix their own problems automatically.
Earned Trust
More capabilities unlock as agents prove reliability.
Always Improving
Continuous optimization through controlled experiments.
Built-in Safety
Guardrails and checkpoints at every critical step.
Perfect Memory
Context persists across sessions. Nothing is forgotten.
Blueprint Driven
Every agent follows defined blueprints. The Golden Blueprint guides the system.
Observability
See exactly what agents are doing in real time.
Evaluations
Agents are measured and scored on real performance.
Reliable Scripts
Deterministic operations that work every time.
Team Players
Agents collaborate and hand off work seamlessly.
Picks Up Where It Left Off
Progress is never lost between sessions.
Standalone
Just Claude and software. No extra infrastructure.
Error Prevention
Validates inputs and catches mistakes early.
Performance Scores
Know exactly how well each agent performs.
Smooth Handoffs
Context flows naturally between agents.
Start building
The Golden Blueprint is ready.
View on GitHub