SLO-based automation for Kubernetes. Audit your alerts. Automate your responses. Start safe with observe mode, graduate to auto.
Perfect for Platform Teams • Kubernetes-native • Start safe, graduate to auto • Self-hosted, privacy-first
Your availability drops to 97% at 2am. By morning, you've burned through a week's error budget.
Page the on-call, investigate, decide, execute. 20 minutes to scale up while customers suffer.
HPA scales on CPU. But what about error rates? Latency spikes? Database connection issues?
78% of SLO violations happen outside business hours, when response is slowest.
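The 2am scenario above can be checked with a little arithmetic. The following is a minimal sketch, not Reflex's own code: the function names are hypothetical, the 99.9% target is taken from the example SLO on this page, and the six-hour gap between 2am and "morning" is an assumption.

```python
def burn_rate(observed_availability: float, slo_target: float) -> float:
    """Return how many times faster than sustainable the error budget is spent."""
    error_budget = 1.0 - slo_target            # allowed failure fraction
    error_rate = 1.0 - observed_availability   # actual failure fraction
    return error_rate / error_budget


def budget_days_consumed(rate: float, hours: float) -> float:
    """Days' worth of error budget consumed after `hours` at a given burn rate."""
    return rate * hours / 24.0


# Assumption: 99.9% target, 97% observed availability, 6 hours unattended.
rate = burn_rate(0.97, 0.999)
print(f"burn rate: {rate:.1f}x")
print(f"budget consumed: {budget_days_consumed(rate, 6):.1f} days")
```

Under these assumptions the burn rate is about 30x, so six unattended hours use up roughly 7.5 days of budget, which is consistent with the "week's error budget" figure.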
Fix problems in your alerts, then automate responses to SLO violations. Use separately or together.
Find problems in your Prometheus alerts with expert SRE knowledge and AI-powered explanations.
Monitor SLOs in real-time and auto-remediate when they burn. Trust ladder: observe → dry-run → auto.
Install. Define SLOs. Start in observe mode. Graduate to auto as trust builds.
Helm chart or wheel download after purchase. Works with any Prometheus.
Set targets: 99.9% availability, P95 latency < 500ms.
See what Reflex would do. Build confidence.
Fix issues while you sleep. SLOs stay healthy.
apiVersion: slo.reflex.io/v1
kind: SLO
metadata:
  name: api-availability
spec:
  service: "checkout-api"
  target: 99.9 # 99.9% availability
  indicators:
    - name: success_rate
      query: "rate(http_requests_total{code!~'5..'}[5m])"
• Availability target: 99.9%
• Latency target: P95 < 500ms
• K8s health checks
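To make the two targets concrete, here is a small sketch of how availability and P95 latency could be checked against them. It is an illustration only: the helper names are hypothetical, the nearest-rank percentile is one common definition rather than necessarily the one Reflex uses, and in practice the inputs would come from Prometheus queries.

```python
def availability_percent(total_requests: int, server_errors: int) -> float:
    """Percentage of requests that did not fail with a 5xx response."""
    if total_requests == 0:
        return 100.0  # no traffic means nothing failed
    return 100.0 * (total_requests - server_errors) / total_requests


def p95(latencies_ms: list) -> float:
    """95th percentile using the nearest-rank method."""
    if not latencies_ms:
        raise ValueError("need at least one latency sample")
    ordered = sorted(latencies_ms)
    rank = (95 * len(ordered) + 99) // 100  # integer ceil(0.95 * n)
    return ordered[rank - 1]


def meets_targets(total_requests, server_errors, latencies_ms,
                  availability_target=99.9, p95_target_ms=500.0) -> bool:
    """True when both the availability and the latency target hold."""
    return (availability_percent(total_requests, server_errors) >= availability_target
            and p95(latencies_ms) < p95_target_ms)


# 5 errors in 10,000 requests is 99.95% availability, inside the 99.9% target.
print(meets_targets(10_000, 5, [120, 180, 240, 310, 450]))
```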
apiVersion: reflex.io/v1
kind: Reflex
metadata:
  name: api-scale
spec:
  automationLevel: observe # Start safe
  trigger:
    burnRate: "> 2.0" # 2x burn rate
  action:
    type: scale
    target: "deployment/checkout-api"
    parameters:
      replicas: "+2"
✓ Scale when SLO burns too fast
✓ Restart on error rate spikes
✓ Notify team on violations
✓ Rollback bad deployments
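As a rough illustration of the `burnRate: "> 2.0"` trigger and the `replicas: "+2"` parameter in the example above, the sketch below shows one way such strings could be interpreted. The function names and the parsing rules are assumptions made for illustration; they are not Reflex's documented behavior.

```python
import operator

# Comparison symbols this sketch understands (an assumption, not a Reflex spec).
_COMPARATORS = {
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
}


def trigger_fires(expression: str, current_burn_rate: float) -> bool:
    """Evaluate a trigger such as "> 2.0" against the current burn rate."""
    symbol, threshold = expression.split()
    return _COMPARATORS[symbol](current_burn_rate, float(threshold))


def desired_replicas(current: int, replicas: str) -> int:
    """Resolve "+2" or "-1" relative to the current count, or "6" as absolute."""
    if replicas[0] in "+-":
        return max(current + int(replicas), 0)  # never go below zero
    return int(replicas)


# A deployment with 3 replicas burning budget at 3.5x would scale to 5.
if trigger_fires("> 2.0", 3.5):
    print(desired_replicas(3, "+2"))
```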
Build confidence gradually. Start safe, graduate to full automation when ready.
See recommendations without any action. Perfect for understanding what Reflex would do and building confidence.
Simulate actions before executing. See exactly what would happen, validate safety, then commit to automation.
Full automation with safety guardrails. Fix issues automatically while you sleep. Your SLOs stay healthy.
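The three rungs of the trust ladder can be pictured as a single dispatch step. This is a minimal sketch under stated assumptions: the `AutomationLevel` enum and `handle` function are hypothetical names, and the safety guardrails mentioned for auto mode are left out.

```python
from enum import Enum
from typing import Callable


class AutomationLevel(Enum):
    OBSERVE = "observe"
    DRY_RUN = "dry-run"
    AUTO = "auto"


def handle(level: AutomationLevel, description: str,
           execute: Callable[[], None]) -> str:
    """Recommend, simulate, or run an action depending on the trust level."""
    if level is AutomationLevel.OBSERVE:
        return f"recommend: {description}"       # nothing is touched
    if level is AutomationLevel.DRY_RUN:
        return f"would execute: {description}"   # simulated only
    execute()                                    # auto: the action really runs
    return f"executed: {description}"


calls = []
action = lambda: calls.append("scaled")
for level in AutomationLevel:
    print(handle(level, "scale deployment/checkout-api by +2", action))
print(len(calls))  # the action ran only once, in auto mode
```

Keeping the decision in one place means moving a reflex up the ladder changes only its level, not the action definition.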
Automate remediation across 50+ microservices. SLOs stay healthy while you focus on feature work.
Sleep through the night. Reflex handles the routine scaling, restarts, and alerts automatically.
Go beyond CPU-based HPA. Scale on error rates, latency, and custom SLO metrics automatically.
Choose your products. Start free with alert linting and runtime observe mode.
| | Free | Audit | Runtime Starter | Runtime Pro |
|---|---|---|---|---|
| Price | $0 | $99 one-time | $149/mo | $299/mo |
| Alert linting (limited) | ✓ | ✓ | ✓ | ✓ |
| Full Audit + AI | ✗ | ✓ | ✓ | ✓ |
| Runtime observe | ✓ | ✗ | ✓ | ✓ |
| Runtime actions | ✗ | ✗ | ✓ | ✓ |
| Multiple clusters | ✗ | ✗ | ✗ | ✓ |
| | Start Free | Buy Now | Start Trial | Contact |
Free: Perfect for getting started and learning
Audit: Deep alert analysis with AI insights
Runtime Starter: Full trust ladder with automation, 14-day free trial
Runtime Pro: Multiple clusters and priority support
Observe mode lets you see all recommendations Reflex would make. Perfect for learning and building confidence.
No. Reflex runtime works completely offline after installation. Self-hosted and privacy-first.
Your reflexes switch back to observe mode. You keep all the SLO monitoring and recommendations.
Yes! 30 days, no questions asked. If Reflex doesn't improve your SLO reliability, get a full refund.
python3 scripts/install.py --prometheus-url http://your-prometheus:9090
kubectl apply -f - <
kubectl apply -f examples/reflexes/traffic-spike-scale.yaml
reflex runtime status # See recommendations
reflex runtime set api-scale --automation-level auto
Join the platform teams using Reflex to automatically fix issues while they sleep. Start safe with observe mode, graduate to full automation when ready.